Let's first make an obvious observation based on the performance figures:
"You called glDrawElements - and now glDrawElementsInstanced - 2800 times per frame, each call rendering a single quad." Right?
And a single quad is what you called an "Entity" and your for loop ranged over 2800 "entities" (i.e. quads). Right?
Instead of calling glDrawElements 2800 times, with each call rendering a single quad (i.e. two triangles), you should call glDrawElements exactly once per material/texture per frame.
The problem is that you are making _assumptions_ about where performance is going to hurt. One of your assumptions was that rendering a visible quad takes more time than rendering an invisible quad. That's why you collected all visible quads in a list, then looped over that list and rendered each quad/entity with its own draw call.
The reality is that issuing a single draw call is very costly. A single call to glDrawElements that renders a million _invisible_ quads is far faster than 3000 calls to glDrawElements that each render 1 _visible_ quad. So the goal is not to cut down on the number of invisible quads you submit, but to reduce the number of _draw calls_ you issue.
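To make that concrete, here is a minimal CPU-side sketch of batching, assuming every quad in the batch shares one texture. The `Quad`, `Vertex` and `Batch` types and the `buildBatch` function are hypothetical names made up for illustration; they are not from your code. The idea is to write all quads into one vertex array and one index array, so the whole batch can be uploaded once and drawn with one call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side description of one quad ("entity").
struct Quad {
    float x, y;      // centre position
    float halfSize;  // half of the edge length
};

// Interleaved vertex layout: position (x, y) and texture coordinate (u, v).
struct Vertex {
    float x, y, u, v;
};

// All geometry for one texture, ready to be uploaded in one go.
struct Batch {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Packs every quad into a single vertex/index array pair:
// 4 vertices and 6 indices (two triangles) per quad.
Batch buildBatch(const std::vector<Quad>& quads) {
    // The indices must fit into 32 bits (GL_UNSIGNED_INT).
    assert(quads.size() <= UINT32_MAX / 4);

    Batch batch;
    batch.vertices.reserve(quads.size() * 4);
    batch.indices.reserve(quads.size() * 6);

    // Corner order for the two triangles of a quad.
    static const std::uint32_t corners[6] = {0, 1, 2, 2, 3, 0};

    for (const Quad& q : quads) {
        const std::uint32_t base =
            static_cast<std::uint32_t>(batch.vertices.size());
        const float h = q.halfSize;

        batch.vertices.push_back({q.x - h, q.y - h, 0.0f, 0.0f});
        batch.vertices.push_back({q.x + h, q.y - h, 1.0f, 0.0f});
        batch.vertices.push_back({q.x + h, q.y + h, 1.0f, 1.0f});
        batch.vertices.push_back({q.x - h, q.y + h, 0.0f, 1.0f});

        for (std::size_t i = 0; i < 6; ++i) {
            batch.indices.push_back(base + corners[i]);
        }
    }
    return batch;
}

// With a GL context and bound buffers, the batch would then be submitted
// roughly like this (once per texture, not once per quad):
//   glBufferData(GL_ARRAY_BUFFER, batch.vertices.size() * sizeof(Vertex),
//                batch.vertices.data(), GL_DYNAMIC_DRAW);
//   glBufferData(GL_ELEMENT_ARRAY_BUFFER,
//                batch.indices.size() * sizeof(std::uint32_t),
//                batch.indices.data(), GL_DYNAMIC_DRAW);
//   glDrawElements(GL_TRIANGLES, (GLsizei)batch.indices.size(),
//                  GL_UNSIGNED_INT, nullptr);
```

For 2800 quads this produces 11200 vertices and 16800 indices, which is a tiny amount of data for one upload. The draw-call count then depends on how many textures you have, not on how many entities you have.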
First, I have to correct myself, sorry: there was an FPS increase of about 10 frames after implementing instanced rendering, but it wasn't the performance change I had hoped for. I only call glDrawElementsInstanced once per texture, and currently there are no more than 6 textures that are changing (once I implement texture atlases I will of course switch to those). Second, isn't that what glDrawElementsInstanced is supposed to do? I store all the information about a quad (position, color, texture coords, etc.) and update the relevant information once per object. So really I loop through all the objects, updating each one's world position, color, lighting, etc., and store that back in the VAO. After that, I render all of them with glDrawElementsInstanced. That's how I currently do it, but it has only provided a small FPS increase. It may really come down to the hardware I'm testing on: I'm running this on the worst laptop possible so that I'm forced to optimize heavily and it will run fine on any average computer. So it may just be hardware limitations, but if you have any other thoughts I'm really happy to hear them. Thanks for all the help!