macOS terrible drawElements()/drawArrays() performance for voxel chunk rendering

Started by Lightbuffer, November 19, 2022, 08:18:05


Hello again! Following on from the previous topic, I'm trying to make my voxel game run on macOS (10.15.3). I managed to fix some issues (textures not appearing, the window rendering at quarter size, mouse coordinates, etc.), but the main issue is still present: terrible performance when I draw chunk VAOs with an OpenGL 3.3 core profile and LWJGL 3.

I'm pretty sure I'm doing something wrong because I tried running Minecraft 1.13+ on this Mac, and it runs fine. Here is how I'm rendering chunks:

1. I construct a VAO with a single interleaved VBO, where every chunk's vertices are between 0,0,0 and 16,16,16.
2. I render chunks from a flat array indexed as if it were multidimensional, translating each chunk by updating the model-view matrix uniform.

Whenever I load the world, it starts at 36 FPS and dips to 5 FPS once all 8^3 chunks are loaded. Frustum culling is implemented, and fully empty chunks are not rendered. I tried the following:

1. Shrinking the window to its smallest size. My first theory was that the problem was the framebuffer size, i.e. the hardware can't handle so many pixels. I was wrong, because even at something like 100x40 it still lags a lot.
2. Enabling GLFW.glfwWindowHint(GLFW.GLFW_OPENGL_DEBUG_CONTEXT, GLFW.GLFW_TRUE). It doesn't log anything when the lag occurs.
3. Profiling with VisualVM. VisualVM CPU profiler shows that most of the lag comes from calling glDrawElements.
4. I tried switching from glDrawElements() to glDrawArrays(), i.e. getting rid of the IBO. No change.
5. Profiling with Xcode's OpenGL Profiler wasn't possible, because as soon as I tried to collect information it hung the Java process.
6. I tried profiling with System.out.println and System.nanoTime(), and on average I get 130 chunks rendered in 110ms (max: ~2.1ms, min: ~0.2ms). I tested the same on Windows, and it's 0.85ms for 127 chunks (max: 0.0017, min: 0.0003).
7. I tried changing the shaders to output a solid color (instead of shading based on normals). That didn't help much, maybe an extra 1-2 FPS. Calling GL20.glUseProgram(0) before drawing chunks helps, but obviously nothing renders, so I don't know what to make of it.  ;D
8. I tried looking for any OpenGL errors; there is nothing of that sort.
9. I tried running the game in different LWJGL versions (I use 3.2.1, but I tried 3.2.2 and 3.3.0).
10. I tried removing any AWT related code or enabling AWT headless mode.
11. I tried fixing all GL errors until no errors were reported.
12. I tried disabling V-sync by removing Thread.sleep() and calling GLFW.glfwSwapInterval(0).
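
A minimal sketch of the kind of System.nanoTime() measurement described in step 6. The class DrawTimer and its method names are hypothetical and not taken from the game's code; the Runnable stands in for whatever issues the draw call for one chunk. Keep in mind that this only measures how long the call blocks the CPU thread, not how long the GPU spends on the work.

```java
// Hypothetical helper: accumulates per-call timings for a batch of draw calls.
final class DrawTimer {
    private long count = 0;
    private long totalNanos = 0;
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos = Long.MIN_VALUE;

    // Record one measured duration in nanoseconds.
    void record(long nanos) {
        count++;
        totalNanos += nanos;
        minNanos = Math.min(minNanos, nanos);
        maxNanos = Math.max(maxNanos, nanos);
    }

    // Time a single call (e.g. one chunk's draw call) and record it.
    void time(Runnable drawCall) {
        long start = System.nanoTime();
        drawCall.run();
        record(System.nanoTime() - start);
    }

    long count() { return count; }
    long totalNanos() { return totalNanos; }
    long minNanos() { return minNanos; }
    long maxNanos() { return maxNanos; }

    // Human-readable summary in milliseconds.
    String summary() {
        return String.format("%d calls, total %.3f ms, min %.4f ms, max %.4f ms",
                count, totalNanos / 1e6, minNanos / 1e6, maxNanos / 1e6);
    }

    public static void main(String[] args) {
        DrawTimer timer = new DrawTimer();
        for (int chunk = 0; chunk < 130; chunk++) {
            // Placeholder for the real per-chunk draw call.
            timer.time(() -> { });
        }
        System.out.println(timer.summary());
    }
}
```

Creating one timer per frame and printing the summary after the chunk loop gives the chunk count, total, min and max in one line, which makes the macOS and Windows runs easy to compare.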

My best guess right now is that somewhere inside my drawElements() call, errors are getting printed. I suspect this because printing a lot to the logs can cause a significant drop in performance. The only other idea I have is to look up how Minecraft 1.13+ renders chunks and try to copy its rendering process.

If anyone has any ideas on how to fix this, I would really appreciate any pointers or tips. If there is any information I missed, please point it out, and I'll try to include it!

Thank you for your attention!


After multiple changes to the code base, I decided to go back and try solving it again, and it had magically fixed itself! I went back through 200+ commits or so to see what had fixed the issue. It turned out to be changing one of the VBO's attributes (color) from 3 unsigned bytes to 4 unsigned bytes...  :o

I don't know exactly what macOS' issue with it is, but my best guess is that it's due to an odd attribute size. The layout was 3 floats (vertex), 3 floats (normal), 2 floats (UV), 3 unsigned bytes (color), which comes to 35 bytes per vertex. After the fix (adding an extra byte to the color for alpha), it becomes 36 bytes.
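
The byte counts can be checked with a small calculation. This is a minimal sketch, assuming the attributes are tightly packed in the order listed; VertexLayout and its methods are hypothetical names, not part of the game or of LWJGL. It also reports whether the stride is a multiple of 4, which is a commonly recommended alignment for vertex attributes. Whether that alignment is the actual cause of the slowdown on macOS is a guess and is not confirmed here.

```java
import java.util.Arrays;

// Hypothetical helper: computes stride and attribute offsets for the
// interleaved layout position(3f) / normal(3f) / UV(2f) / color(N bytes).
final class VertexLayout {
    // Size in bytes of each attribute, in interleaved order.
    static int[] attributeSizes(int colorBytes) {
        return new int[] {
            3 * Float.BYTES, // position
            3 * Float.BYTES, // normal
            2 * Float.BYTES, // UV
            colorBytes       // color as unsigned bytes
        };
    }

    // Total bytes per vertex, assuming no padding between attributes.
    static int stride(int colorBytes) {
        return Arrays.stream(attributeSizes(colorBytes)).sum();
    }

    // Byte offset of each attribute from the start of a vertex.
    static int[] offsets(int colorBytes) {
        int[] sizes = attributeSizes(colorBytes);
        int[] result = new int[sizes.length];
        int running = 0;
        for (int i = 0; i < sizes.length; i++) {
            result[i] = running;
            running += sizes[i];
        }
        return result;
    }

    // True when every vertex starts on a 4-byte boundary (assumed guideline).
    static boolean strideIsMultipleOfFour(int colorBytes) {
        return stride(colorBytes) % 4 == 0;
    }

    public static void main(String[] args) {
        for (int colorBytes : new int[] {3, 4}) {
            System.out.println("colorBytes=" + colorBytes
                    + " stride=" + stride(colorBytes)
                    + " offsets=" + Arrays.toString(offsets(colorBytes))
                    + " multipleOf4=" + strideIsMultipleOfFour(colorBytes));
        }
    }
}
```

With 3 color bytes, every vertex after the first starts at a byte offset that is not a multiple of 4, so its floats are misaligned; with 4 color bytes, all float attributes stay on 4-byte boundaries.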

It sort of makes sense, I guess. Anyway, if anyone encounters this issue, try changing the VBO's total bytes per vertex to an even number, probably.  ;D

If anyone knows why this happens, I would love to hear the reason.