Is it possible to free all the buffers pointed to by a PointerBuffer?

Started by bcbradle, April 23, 2017, 16:08:44

bcbradle

Specifically, without knowing their lengths.

Basically I want to be able to free the pointer buffer and all the things pointed to by it given only the pointer buffer instance.

spasi

Yes, only the pointer is necessary to free an allocation.

bcbradle

Quote from: spasi on April 23, 2017, 19:02:35
Yes, only the pointer is necessary to free an allocation.

how would you do it then?

I mean, I know how to free a PointerBuffer instance foo using foo.free(), but what about all the buffers the PointerBuffer instance foo contains?

foo.get() returns a long (presumably the address of the current buffer stored), but can you use that to free anything? I'm looking inside MemoryUtil but can't find anything that would allow me to free an arbitrary address.

I was thinking of maybe doing foo.getByteBuffer().free(), but there isn't any such method (only getByteBuffer(size)). I would have to know the size of the current buffer in bytes in order to free it this way.

It seems kind of odd though. I don't know how memory allocations are done on the native side in LWJGL, but the C function "free" doesn't require a size, only an address (assuming it was previously allocated with malloc, calloc or realloc). So I feel it would be strange if it weren't possible to free memory with just an address in LWJGL, and that is why I'm asking.

spasi

Raw pointers are considered "unsafe" in LWJGL, so methods that accept them are prefixed with 'n'. The method you're looking for is MemoryUtil.nmemFree(long). Example:

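import static org.lwjgl.system.MemoryUtil.*;

PointerBuffer pp = ...;
// get(int) is an absolute read, so walk from position() to limit()
for ( int i = pp.position(); i < pp.limit(); i++ ) {
    nmemFree(pp.get(i)); // free each allocation the buffer points to
}
memFree(pp); // then free the pointer buffer itself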


Quote from: bcbradle on April 23, 2017, 19:18:23
I don't know how memory allocations are done on the native side in LWJGL

The MemoryUtil methods directly call the corresponding method of the native memory allocator. The only utility they have is that the memory allocator is configurable; you can try another implementation without changing your program. By default, if the bindings are available, jemalloc will be used as the memory allocator. Otherwise, the default system allocator is used. I would like to try rpmalloc soon, it'd be nice to have a third option if it's competitive. Btw, you also have the option of using one allocator via MemoryUtil and another by using the corresponding bindings directly (via org.lwjgl.system.libc.LibCStdlib and org.lwjgl.system.jemalloc.*).
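To make the "configurable" part concrete: as far as I know, the allocator choice is read from the org.lwjgl.system.allocator system property (also exposed as Configuration.MEMORY_ALLOCATOR), with values such as jemalloc or system. Treat the property name and its values as assumptions to check against the Configuration javadoc for your LWJGL version. AllocatorChoice is a made-up helper, not part of LWJGL, and it only sets the property, so it compiles without LWJGL on the classpath.

```java
final class AllocatorChoice {
    // Assumed property name; verify against org.lwjgl.system.Configuration.
    static final String PROPERTY = "org.lwjgl.system.allocator";

    private AllocatorChoice() {}

    /**
     * Records the requested allocator name and returns the value now stored.
     * Assumption: this has to run before MemoryUtil is first used, because
     * the allocator is expected to be picked once when that class loads.
     */
    static String select(String name) {
        System.setProperty(PROPERTY, name);
        return System.getProperty(PROPERTY);
    }
}
```

Calling AllocatorChoice.select("system") at the very top of main should have the same effect as passing -Dorg.lwjgl.system.allocator=system on the command line.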

bcbradle

Sweet thanks for all your help, vulkan in clojure is going to rock :)

Rampant Pixels

Quote from: spasi on April 23, 2017, 19:37:04
I would like to try rpmalloc soon, it'd be nice to have a third option if it's competitive.

Sorry for the thread necromancy. I wrote rpmalloc and was just browsing around for places where it is used. Let me know if I can help you in any way to test it. I'm interested in seeing how it performs in real-world scenarios outside my benchmarks and my own projects using it.

spasi

I've been waiting for jemalloc 5 to stabilize (5.0.1 has already been released with important fixes) before updating LWJGL to use it. When that happens (hopefully soon), I'll also add rpmalloc. Will post here again when it's available.

spasi

The latest snapshot (3.1.3 build 9) includes jemalloc 5.0.1 and rpmalloc.

rpmalloc needs per-thread setup, so enabling it requires code changes: you call rpmalloc_thread_initialize() on thread start and rpmalloc_thread_finalize() on thread exit. It's simple enough, but it might be automated in the future.
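One way to get that per-thread setup without touching every Runnable is a ThreadFactory that runs a hook when a thread starts and another when it exits. This is only a sketch: AllocatorThreads and withThreadHooks are made-up names, not part of LWJGL, and the two hooks are passed in as plain Runnables so the snippet compiles without LWJGL on the classpath.

```java
final class AllocatorThreads {
    private AllocatorThreads() {}

    /**
     * Returns a factory whose threads call onStart before running their task
     * and onExit afterwards, even if the task throws.
     */
    static java.util.concurrent.ThreadFactory withThreadHooks(Runnable onStart, Runnable onExit) {
        return task -> new Thread(() -> {
            onStart.run();      // e.g. the per-thread initialize call
            try {
                task.run();     // the thread's actual work
            } finally {
                onExit.run();   // e.g. the per-thread finalize call
            }
        });
    }
}
```

With LWJGL present, the two hooks would wrap the rpmalloc_thread_initialize() and rpmalloc_thread_finalize() calls mentioned above, and the factory can be handed to Executors.newFixedThreadPool(n, factory) so every worker thread gets the setup. Check the exact signatures in the LWJGL version you use.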

jemalloc's performance is not competitive on Windows anymore, so that platform now defaults to the system allocator. It's recommended to enable rpmalloc if you can implement the changes mentioned above.

Quote from: Rampant Pixels on July 08, 2017, 21:18:58
I'm interested in seeing how it performs in real-world scenarios outside my benchmarks and my own projects using it.

Results for a trivial benchmark can be found here. It's N threads doing a small malloc + free in parallel. The overhead for the two JNI calls on the Ryzen machine is about 13ns (included in the benchmark scores).