I'm having some issues managing my FloatBuffers. I need some advice on the most appropriate way to do this before a third attempt sucks another week from me. :-)
Some background; I'm making a voxel game. Not a Minecraft clone. For the purpose of this post however, imagine I am making Minecraft. I have my world split into 16x16x16 chunks of char[] arrays, rendered as VBOs. I only generate the vertex data once, when the world is changed. Not every frame (of course!). I generate my chunks and vertex buffers in separate threads. The main thread is only used for the actual glBufferDataARB() call. This is very quick, and it works very well for me.
The process was:
- BufferUtils.createFloatBuffer()
- a ton of buffer.put(vertexdata)
- glBindBuffer() (with a unique-per-chunk id) and glBufferDataARB()
- buffer = null;
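Sketched in code, that flow looked roughly like this (a minimal sketch, not the actual code: it uses plain java.nio in place of LWJGL's BufferUtils, the names and sizing are illustrative, and the GL calls are left as comments since they need a current GL context):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class ChunkUpload {

    // Equivalent of BufferUtils.createFloatBuffer(): a direct, native-order buffer.
    static FloatBuffer createFloatBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity * Float.BYTES)
                         .order(ByteOrder.nativeOrder())
                         .asFloatBuffer();
    }

    // Worker thread: build the vertex data for one chunk.
    static FloatBuffer buildVertexData(float[] vertexData) {
        FloatBuffer buffer = createFloatBuffer(vertexData.length);
        buffer.put(vertexData);  // "a ton of buffer.put(vertexdata)"
        buffer.flip();           // prepare for reading by the GL upload
        return buffer;
    }

    // Main thread: the only GL work, per the post.
    static void upload(int vboId, FloatBuffer buffer) {
        // glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId);
        // glBufferDataARB(GL_ARRAY_BUFFER_ARB, buffer, GL_STATIC_DRAW_ARB);
        // buffer = null;  // does NOT promptly free the native memory (the crux of the problem)
    }
}
```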
The problem: after a while, my game crashes because it runs out of memory. My java.nio.FloatBuffers weren't relinquishing their memory when I set them to null. I learned that they're backed by native memory outside the normal GC heap, so nulling the reference doesn't promptly free it. That's cool; I'll make a buffer pool. Preallocate as many buffers as is practical / suited to the purpose. I started with 100.
The process then became:
- A ton of BufferUtils.createFloatBuffer() calls to preallocate the buffers into an arraylist of "available" buffers.
- When a chunk needs to generate render data, it asks for a free buffer from the pool. If there isn't one, it waits.
- When it gets a buffer, the chunk populates it, binds chunk.vbo_id (its unique VBO id), and then calls glBufferData()
- The buffer is then returned to the BufferPool, where buffer.clear() is called and it is added back to the list
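A minimal sketch of that pool (the names are mine, not from the original code; an ArrayBlockingQueue gives the blocking "wait for a free buffer" behaviour for free, and guarantees only one holder per buffer at a time):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BufferPool {
    private final BlockingQueue<FloatBuffer> available;

    // Preallocate a fixed number of direct buffers up front.
    public BufferPool(int poolSize, int floatsPerBuffer) {
        available = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            available.add(ByteBuffer.allocateDirect(floatsPerBuffer * Float.BYTES)
                                    .order(ByteOrder.nativeOrder())
                                    .asFloatBuffer());
        }
    }

    /** Blocks until a buffer is free, so the caller "waits" as described above. */
    public FloatBuffer acquire() {
        try {
            return available.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** Reset the buffer and hand it back for reuse. */
    public void release(FloatBuffer buffer) {
        buffer.clear();
        available.add(buffer);
    }
}
```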
This never crashes... Which was the intention of the buffer pool. However, when a buffer gets used for a second time; the first chunk's render data goes insane. Does this mean that glBufferData() doesn't send the buffer's vertex data to the GPU? The buffer pool and chunk generation are both thread safe. There is always only one reference to a given Buffer, as it is removed from the list of available buffers when it is requested by a chunk.
It hit me that there are probably much better ways to achieve this. So here I am asking. Please give me any advice you can think of that would help me recycle FloatBuffers for VBOs indefinitely.
You should probably post code for how your buffers are managed.
However, I do have one thought: why preemptively create a bunch of buffers? You'd be much more memory-efficient with some lazy loading / timeout-based deletion.
Basically:
- Start out with the buffer pool completely empty.
- When you need a new buffer, check the pool. If a buffer is free, use it; if not, create a new one.
- When you finish using a buffer (dispose of a chunk), push it back into the pool.
Now, if you keep track of how long it has been since you used the buffers in it, you can destroy buffers if they've gone unused for a given period of time. This way you won't be using up an excessive amount of memory if you don't need to.
A LinkedList is very good for implementing both of the above, since you can use it as a stack or a queue extremely efficiently.
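The steps above might be sketched like so (names and details are mine; a LinkedList serves as the deque, as suggested, and "destroying" an idle buffer here just means dropping the reference so the GC can eventually run its cleaner):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Deque;
import java.util.LinkedList;

public class LazyBufferPool {
    private static final class PooledBuffer {
        final FloatBuffer buffer;
        final long releasedAt;  // when this buffer was last returned
        PooledBuffer(FloatBuffer buffer, long releasedAt) {
            this.buffer = buffer;
            this.releasedAt = releasedAt;
        }
    }

    private final Deque<PooledBuffer> free = new LinkedList<>();
    private final int floatsPerBuffer;
    private final long maxIdleMillis;

    public LazyBufferPool(int floatsPerBuffer, long maxIdleMillis) {
        this.floatsPerBuffer = floatsPerBuffer;
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Pool starts empty: reuse a free buffer if one exists, else allocate lazily. */
    public synchronized FloatBuffer acquire() {
        PooledBuffer pooled = free.pollFirst();
        if (pooled != null) return pooled.buffer;
        return ByteBuffer.allocateDirect(floatsPerBuffer * Float.BYTES)
                         .order(ByteOrder.nativeOrder())
                         .asFloatBuffer();
    }

    /** Return a buffer, stamped so idle buffers can be evicted later. */
    public synchronized void release(FloatBuffer buffer) {
        buffer.clear();
        free.addFirst(new PooledBuffer(buffer, System.currentTimeMillis()));
    }

    /** Drop references to buffers idle past the timeout. Oldest entries sit at
     *  the tail because release() pushes to the head. */
    public synchronized void evictIdle() {
        long now = System.currentTimeMillis();
        while (!free.isEmpty() && now - free.peekLast().releasedAt > maxIdleMillis) {
            free.pollLast();
        }
    }

    public synchronized int freeCount() { return free.size(); }
}
```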
I would normally post code too, but it's a lot of code in this case. I'm more after a discussion of the concepts involved than fixes for typos in code.
What you've suggested is what I would do ideally. But I can't find any way in the documentation to destroy the buffers. How would you go about destroying them? I typically just assign things to null (buffer = null;), but in the case of these buffers that doesn't free the memory. A fellow in IRC said the other day that it's because the buffers use native memory, which lives in a different space than the heap the VM's garbage collection manages.
A suggestion on your current solution (which I believe to be the best I have seen in my time with LWJGL): I am pretty sure that glBufferData is not blocking, in the sense that just because the function has returned doesn't mean the copy to the GPU is complete. OpenGL 4.0 has synchronization features (I think), but seeing as you are using the ARB extension, I am assuming you are not targeting 4.0. I do have a feeling that glMapBuffer and glUnmapBuffer may be the way to go - I might have read somewhere that they are blocking functions (or does OpenGL allocate the memory for the mapping itself?). Anyway, I just read the specs and there wasn't anything about blocking in there. Failing that, there is always glFlush. (This is all of course assuming that this is your problem.)
Anyway, the other thing is that there may be a way to force Java to release the memory on request using the hackery magic of reflection (I call it hackery because I've never understood why you'd make an OOP language with all these restrictions that stop you doing stupid stuff, and then just say screw it and stick the reflection API in there). Essentially, a direct buffer's cleaner tells the system to release the native memory, but normally the GC is the only one who can invoke it.
From Stack Overflow - A guy called "Li Pi":
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import com.google.common.base.Preconditions; // Guava

/**
 * DirectByteBuffers are garbage collected by using a phantom reference and a
 * reference queue. Every once in a while, the JVM checks the reference queue
 * and cleans the DirectByteBuffers. However, as this doesn't happen
 * immediately after discarding all references to a DirectByteBuffer, it's
 * easy to OutOfMemoryError yourself using DirectByteBuffers. This function
 * explicitly calls the Cleaner method of a DirectByteBuffer.
 *
 * @param toBeDestroyed
 *            The DirectByteBuffer that will be "cleaned". Utilizes reflection.
 */
public static void destroyDirectByteBuffer(ByteBuffer toBeDestroyed)
        throws IllegalArgumentException, IllegalAccessException,
        InvocationTargetException, SecurityException, NoSuchMethodException {
    Preconditions.checkArgument(toBeDestroyed.isDirect(),
            "toBeDestroyed isn't direct!");
    Method cleanerMethod = toBeDestroyed.getClass().getMethod("cleaner");
    cleanerMethod.setAccessible(true);
    Object cleaner = cleanerMethod.invoke(toBeDestroyed);
    Method cleanMethod = cleaner.getClass().getMethod("clean");
    cleanMethod.setAccessible(true);
    cleanMethod.invoke(cleaner);
}
Warning - not tested or even read thoroughly (and as already stated I avoid reflection anyway) But hey, it comes with a nice little explanation packaged with it.
The idea that glBufferData() may not be blocking could be the cause of all my problems. Thanks for the insight there. I've been sticking to around OpenGL 2.1, as a happy medium between compatibility and features. I saved the black-magic snippet from Stack Exchange to use as a last resort.
I think I've come up with a solution that would solve the problem (particularly if it is a blocking problem): separate the entire VBO/buffer/render-data generation into a pool of ChunkRenderComponents, each with one buffer and its own VBO id. Keep instances of this whole structure in a pool, have a chunk request one as it comes into view, and relinquish it as it leaves view. The only gotcha I can think of with this is that the number of buffers is then tied directly to the view/draw distance.
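That structure could look something like this (a sketch of the idea only; the int VBO ids stand in for glGenBuffersARB() results, and all names are made up):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class ChunkRenderComponent {
    final int vboId;           // would come from glGenBuffersARB() in a real version
    final FloatBuffer buffer;  // scratch space for this component's vertex data

    ChunkRenderComponent(int vboId, int floatsPerChunk) {
        this.vboId = vboId;
        this.buffer = ByteBuffer.allocateDirect(floatsPerChunk * Float.BYTES)
                                .order(ByteOrder.nativeOrder())
                                .asFloatBuffer();
    }

    /** Pool sized to the view distance: one component per potentially visible chunk. */
    public static final class Pool {
        private final Deque<ChunkRenderComponent> free = new ArrayDeque<>();

        public Pool(int visibleChunks, int floatsPerChunk) {
            for (int i = 0; i < visibleChunks; i++) {
                free.push(new ChunkRenderComponent(i, floatsPerChunk));
            }
        }

        /** A chunk coming into view requests a component (null if none are free). */
        public synchronized ChunkRenderComponent acquire() {
            return free.poll();
        }

        /** A chunk leaving view relinquishes its component for reuse. */
        public synchronized void release(ChunkRenderComponent c) {
            c.buffer.clear();
            free.push(c);
        }
    }
}
```

This makes the view-distance coupling explicit: the pool is exactly as big as the set of chunks that can be visible at once.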
I'm actually curious as to how Minecraft handles this particular issue.
Scratch that. After browsing Stack Exchange myself, I've found that glFinish() is designed to do exactly that: it'll block while data is transferred to the GPU. Apparently without glFinish(), the data is only transferred when it is used, at which point my buffers have been recycled already, which would explain the gibberish rendering I get.
Good skills finding that. See, that is the kind of thing that should be in the specifications, shouldn't it? I've got so used to using the specs as my ultimate source of all knowledge that I rarely use other sources anymore.
Suds, you're wrong. BufferData and BufferSubData cannot return before the copy is complete. Note that I said copy, not upload. If the VBO resides on GPU memory, then the GL driver is free to asynchronously upload the data at a later time.
Anytime you pass data to the GL directly (i.e. without Map/Unmap) you can be sure that the driver has copied your data and you're free to reuse the ByteBuffer after the call returns. Buffer(Sub)Data does block for the copy. Immediately binding and using the VBO, before the upload is complete, is not necessarily bad either, because that's another asynchronous operation the driver can queue up and perform later (or block and wait if it can't).
It does sound like you have a memory leak somewhere. I can't see how you could be allocating FloatBuffers so fast that the GC doesn't get to run the direct buffer cleaner. I've never had that issue before. You may need to also have a look at your thread synchronization, what you describe seeing after switching to a pre-allocated pool sounds like a sync issue.