Crash in jemalloc after terminating application

Started by codex, July 30, 2025, 21:01:03

codex

I'm experiencing a crash in libjemalloc.so using LWJGL 3.3.6. It appears to occur after my Vulkan application has already terminated, which leads me to believe it's a cleanup issue involving jemalloc.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f114c25933a, pid=93473, tid=93497
#
# JRE version: Java(TM) SE Runtime Environment (23.0.2+7) (build 23.0.2+7-58)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.0.2+7-58, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libjemalloc.so+0x5933a]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/codex/java/prj/jmonkeyengine/core.93473)
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

The trouble is, I have no idea what I could be doing that would cause a crash such as this. I'm not getting any errors or warnings from Vulkan or anywhere else apart from this crash, and I haven't gotten any leads from looking at the error file either (it doesn't include a stacktrace like other error files I've dealt with).

Does anyone know what mistakes I should look for in debugging this?

spasi

Hey codex,

Try setting -Dorg.lwjgl.util.DebugAllocator=true. It should report any unreleased memory allocations on process exit. If it crashes before that, you could try manually calling the MemoryUtil.memReport methods to print any active allocations before quitting the application.

codex

Thanks, that helps immensely. The leaks seem to be caused by various (if not all) callbacks I'm using, although I'm not sure why they would leak. Here is one such leak:

[LWJGL] 56 bytes leaked, thread 21 (LwjglVulkanContext), address: 0x76982563FEE0
    at org.lwjgl.system.Callback.create(Callback.java:163)
    at org.lwjgl.system.Callback.<init>(Callback.java:113)
    at org.lwjgl.glfw.GLFWWindowSizeCallback.<init>(GLFWWindowSizeCallback.java:55)
    at com.jme3.system.vulkan.LwjglVulkanContext$3.<init>(LwjglVulkanContext.java:129)
    at com.jme3.system.vulkan.LwjglVulkanContext.glfwInitialize(LwjglVulkanContext.java:129)
    at com.jme3.system.vulkan.LwjglVulkanContext.engineInitialize(LwjglVulkanContext.java:85)
    at com.jme3.system.vulkan.LwjglVulkanContext.run(LwjglVulkanContext.java:63)
    at java.base/java.lang.Thread.run(Thread.java:1575)

The leak report refers to this GLFW call during initialization:

glfwSetWindowSizeCallback(window, sizeCallback = new GLFWWindowSizeCallback() {
    @Override
    public void invoke(final long window, final int width, final int height) {
        updateSizes();
    }
});

Why would this be leaking?

Edit: Oops, just realized this is probably the root of the problem, and not necessarily the leaks...

Exception in thread "LwjglVulkanContext" java.lang.IllegalStateException: The memory address specified is not being tracked: 0x75F193FFDC50
	at org.lwjgl.system.MemoryManage$DebugAllocator.untrackAbort(MemoryManage.java:327)
	at org.lwjgl.system.MemoryManage$DebugAllocator.untrack(MemoryManage.java:319)
	at org.lwjgl.system.MemoryManage$DebugAllocator.free(MemoryManage.java:240)
	at org.lwjgl.system.MemoryUtil.nmemFree(MemoryUtil.java:338)
	at org.lwjgl.system.Struct.free(Struct.java:65)
	at org.lwjgl.system.NativeResource.close(NativeResource.java:20)
	at com.jme3.vulkan.VulkanLogger.invoke(VulkanLogger.java:52)
	at org.lwjgl.vulkan.VkDebugUtilsMessengerCallbackEXTI.callback(VkDebugUtilsMessengerCallbackEXTI.java:58)
	at org.lwjgl.system.JNI.callPPPI(Native Method)
	at org.lwjgl.vulkan.VK10.nvkEnumeratePhysicalDevices(VK10.java:3046)
	at org.lwjgl.vulkan.VK10.vkEnumeratePhysicalDevices(VK10.java:3100)
	at com.jme3.vulkan.PhysicalDevice.lambda$getSuitableDevice$4(PhysicalDevice.java:140)
	at com.jme3.renderer.vulkan.VulkanUtils.enumerateBuffer(VulkanUtils.java:49)
	at com.jme3.vulkan.PhysicalDevice.getSuitableDevice(PhysicalDevice.java:139)
	at jme3test.vulkan.VulkanHelperTest.simpleInitApp(VulkanHelperTest.java:117)
	at com.jme3.app.SimpleApplication.initialize(SimpleApplication.java:240)
	at com.jme3.system.vulkan.LwjglVulkanContext.engineInitialize(LwjglVulkanContext.java:88)
	at com.jme3.system.vulkan.LwjglVulkanContext.run(LwjglVulkanContext.java:63)
	at java.base/java.lang.Thread.run(Thread.java:1575)

The stack trace points to this callback in the application:

@Override
public int invoke(int messageSeverity, int messageTypes, long pCallbackData, long pUserData) {
    try (VkDebugUtilsMessengerCallbackDataEXT data = VkDebugUtilsMessengerCallbackDataEXT.create(pCallbackData)) {
        Level lvl = getLoggingLevel(messageSeverity);
        if (exceptionThreshold != null && lvl.intValue() >= exceptionThreshold.intValue()) {
            throw new RuntimeException(lvl.getName() + ": " + data.pMessageString());
        } else {
            System.err.println(lvl.getName() + "  " + data.pMessageString());
        }
    } // exception occurs here
    return VK_FALSE;
}

spasi

Yes, the pCallbackData struct is allocated and managed by the Vulkan implementation. By wrapping it in try-with-resources, it is getting freed from your side, then freed again from the driver side. A double-free is a serious bug that usually leads to crashes. The client-side free might crash as well, if the allocator used is different.
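
To make the ownership point concrete, here is a minimal sketch in plain Java. It is a toy model, not LWJGL's actual implementation: TrackingAllocator, StructView and closingForeignViewFails are hypothetical names invented for illustration, and the addresses are just numbers. It only shows why closing a view over memory you did not allocate is rejected by a tracking allocator, while closing a view over memory you did allocate is fine.

```java
import java.util.HashSet;
import java.util.Set;

public class OwnershipSketch {

    /** Toy stand-in for a tracking allocator; addresses are plain counters. */
    static final class TrackingAllocator {
        private final Set<Long> live = new HashSet<>();
        private long next = 0x1000;

        /** Hands out a fake address and remembers it as live. */
        long malloc() {
            long address = next;
            next += 0x10;
            live.add(address);
            return address;
        }

        /** Frees a tracked address; rejects anything it never handed out. */
        void free(long address) {
            if (!live.remove(address)) {
                throw new IllegalStateException(
                    "The memory address specified is not being tracked: 0x"
                        + Long.toHexString(address).toUpperCase());
            }
        }

        /** Number of allocations that have not been freed yet. */
        int liveCount() {
            return live.size();
        }
    }

    /** Toy struct view: close() frees the address, no matter who allocated it. */
    static final class StructView implements AutoCloseable {
        private final TrackingAllocator allocator;
        final long address;

        StructView(TrackingAllocator allocator, long address) {
            this.allocator = allocator;
            this.address = address;
        }

        @Override
        public void close() {
            allocator.free(address);
        }
    }

    /** Returns true if closing a view over memory we do not own is rejected. */
    static boolean closingForeignViewFails(TrackingAllocator allocator, long foreignAddress) {
        try (StructView view = new StructView(allocator, foreignAddress)) {
            // Reading through the view would be fine; the implicit close() is the problem.
            if (view.address != foreignAddress) {
                throw new AssertionError("view should wrap the given address");
            }
        } catch (IllegalStateException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TrackingAllocator allocator = new TrackingAllocator();

        // Memory we allocated ourselves: closing the view is correct.
        long owned = allocator.malloc();
        try (StructView view = new StructView(allocator, owned)) {
            System.out.println("using owned struct at 0x" + Long.toHexString(view.address));
        }
        System.out.println("live after owned close: " + allocator.liveCount());

        // Memory handed to us by someone else (here: an arbitrary address).
        long foreign = 0x75F193FFDC50L;
        System.out.println("foreign close rejected: "
            + closingForeignViewFails(allocator, foreign));
    }
}
```

In the real callback the equivalent rule is to wrap pCallbackData with create(...) and read from it, but never close or free it, since the driver owns that memory.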

The callbacks themselves allocate native resources in LWJGL, which typically need to be released. Not releasing them on application exit is not necessarily problematic; however, it's good practice to free them to reduce the debug allocator output, so that legit bugs are more visible.

codex

That makes sense. I'm creating the callback data struct for existing memory, not allocating memory for the struct. Getting rid of the try-with-resources fixed the exception and fixed the native crash. I also managed to patch all of the memory leaks that were reported (several of which had to do with unclosed callbacks).

Thank you so much for your help! :D