[Solved] OpenCL: JVM crash when retrieving program binaries

Started by shaman, May 24, 2016, 09:01:27


shaman

Hi folks,

I am trying to retrieve the binaries of an OpenCL program so that I do not have to recompile it every time.
While this works with LWJGL2, my code for LWJGL3 does not:

long device, program; //Function inputs

int numDevices = Info.clGetProgramInfoInt(program, CL10.CL_PROGRAM_NUM_DEVICES);

PointerBuffer devices = PointerBuffer.allocateDirect(numDevices);
int ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_DEVICES, devices, null);
Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_DEVICES");
int index = -1;
for (int i=0; i<numDevices; ++i) {
	if (devices.get(i) == device) {
		index = i;
	}
}
if (index == -1) {
	 throw new Exception("Program was not built against the specified device "+device);
}

PointerBuffer sizes = PointerBuffer.allocateDirect(numDevices);
ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_BINARY_SIZES, sizes, null);
Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_BINARY_SIZES");
int size = (int) sizes.get(index);

PointerBuffer binaryPointers = PointerBuffer.allocateDirect(numDevices * 8);
for (int i=0; i<binaryPointers.capacity(); ++i) {
	binaryPointers.put(0L);
}
binaryPointers.rewind();
ByteBuffer binaries = ByteBuffer.allocateDirect(size);
binaryPointers.put(index, binaries);

//Fixme: why the hell does this line throw a segfault ?!?
ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_BINARIES, binaryPointers, null);
Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_BINARIES");

return binaries;


Does anyone have an idea why this code crashes the JVM?

Thanks

Kai

The copied code snippet seems to be broken. If the missing part of the snippet does not already show which method call crashed (was it CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_DEVICES, devices, null)?), then please tell us exactly which method call, with which arguments, crashed.
Also, please attach the crash log (the hs_err_pidXXXX.log) to the post.

spasi

The Info class has been removed in LWJGL 3 (in 3.0.0 nightly builds at least). Try this:

int ret;
PointerBuffer devices;
try ( MemoryStack stack = stackPush() ) {
	IntBuffer ip = stack.mallocInt(1);
	ret = clGetProgramInfo(program, CL_PROGRAM_NUM_DEVICES, ip, null);
	Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_NUM_DEVICES");

	devices = BufferUtils.createPointerBuffer(ip.get(0));
	ret = clGetProgramInfo(program, CL_PROGRAM_DEVICES, devices, null);
	Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_DEVICES");
}

shaman

Why is the whole code not shown? Last time I checked, it was still there.
Anyway, here is the code again:

        long device, program; //Input
        int numDevices = Info.clGetProgramInfoInt(program, CL10.CL_PROGRAM_NUM_DEVICES);
       
        PointerBuffer devices = PointerBuffer.allocateDirect(numDevices);
        int ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_DEVICES, devices, null);
        Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_DEVICES");
        int index = -1;
        for (int i=0; i<numDevices; ++i) {
            if (devices.get(i) == device) {
                index = i;
            }
        }
        if (index == -1) {
             throw new Exception("Program was not built against the specified device "+device);
        }
       
        PointerBuffer sizes = PointerBuffer.allocateDirect(numDevices);
        ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_BINARY_SIZES, sizes, null);
        Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_BINARY_SIZES");
        int size = (int) sizes.get(index);
       
        PointerBuffer binaryPointers = PointerBuffer.allocateDirect(numDevices * 8 );
        for (int i=0; i<binaryPointers.capacity(); ++i) {
            binaryPointers.put(0L);
        }
        binaryPointers.rewind();
        ByteBuffer binaries = ByteBuffer.allocateDirect(size);
        binaryPointers.put(index, binaries);
       
        //Fixme: The JVM crashes in this line!
        ret = CL10.clGetProgramInfo(program, CL10.CL_PROGRAM_BINARIES, binaryPointers, null);
        Utils.checkError(ret, "clGetProgramInfo: CL_PROGRAM_BINARIES");
       
        return binaries;

Kai

1. Your 'binaryPointers' PointerBuffer is too large. It should contain just `numDevices` pointers, not `numDevices * 8`. As written, LWJGL passes a `param_value_size` argument that is too large to the native `clGetProgramInfo` function. So remove the multiplication by 8.
2. You only seem to query the program binary for the device with index 'index', by setting all other pointers to NULL. I am not sure whether this actually works. Instead I think you have to allocate binary buffers for all program devices and put all of their addresses into `binaryPointers`. (<- this is wrong, see Spasi's answer below)
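
To make point 1 concrete, here is a minimal sketch of the size arithmetic in plain Java, with no LWJGL dependency. It assumes, as described in point 1, that the byte count handed to the native call is the number of pointer slots times the pointer size. The class name `BinaryPointerSizing`, the helper `paramValueSize`, the device count of 2, and the 8-byte pointer size (a 64-bit JVM) are illustrative assumptions, not part of LWJGL.

```java
final class BinaryPointerSizing {

    // Hypothetical helper: bytes covered by an array of `pointerCount`
    // pointers when each pointer occupies `pointerSizeBytes` bytes.
    public static long paramValueSize(int pointerCount, int pointerSizeBytes) {
        return (long) pointerCount * pointerSizeBytes;
    }

    public static void main(String[] args) {
        int numDevices = 2;   // assumed number of program devices
        int pointerSize = 8;  // assumed 64-bit pointers

        // One pointer slot per device: what CL_PROGRAM_BINARIES needs.
        long expected = paramValueSize(numDevices, pointerSize);

        // numDevices * 8 slots, as in the posted code: eight times larger.
        long oversized = paramValueSize(numDevices * 8, pointerSize);

        System.out.println("expected bytes:  " + expected);   // 16
        System.out.println("oversized bytes: " + oversized);  // 128
    }
}
```

A `PointerBuffer` counts its capacity in pointers rather than bytes, which is why the extra factor of 8 is not needed.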

spasi

Quote from: Kai on May 25, 2016, 08:47:30
1. Your 'binaryPointers' PointerBuffer is too large. It should contain just `numDevices` pointers, not `numDevices * 8`. As written, LWJGL passes a `param_value_size` argument that is too large to the native `clGetProgramInfo` function. So remove the multiplication by 8.

You also don't need to fill the array with NULLs. Memory allocated with allocateDirect is guaranteed to be zero filled.
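
The zero-fill guarantee can be checked with plain `java.nio`, as a rough sketch. It treats each 8-byte slot of a direct `ByteBuffer` as one pointer entry (assuming 64-bit pointers). The class name `ZeroFilledSlots` and the helper `allSlotsNull` are made up for this illustration and only model what a pointer array looks like right after allocation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ZeroFilledSlots {

    // Hypothetical helper: allocates room for `slotCount` 8-byte slots and
    // reports whether every slot reads back as 0 (i.e. NULL) without any
    // explicit fill loop.
    public static boolean allSlotsNull(int slotCount) {
        ByteBuffer raw = ByteBuffer
                .allocateDirect(slotCount * Long.BYTES)
                .order(ByteOrder.nativeOrder());
        for (int i = 0; i < slotCount; i++) {
            // Absolute read; does not move the buffer position.
            if (raw.getLong(i * Long.BYTES) != 0L) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allSlotsNull(4)); // prints "true"
    }
}
```

With the slots already zero, only the entry for the device of interest needs to be set before the query.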

Quote from: Kai on May 25, 2016, 08:47:30
2. You only seem to query the program binary for the device with index 'index', by setting all other pointers to NULL. I am not sure whether this actually works. Instead I think you have to allocate binary buffers for all program devices and put all of their addresses into `binaryPointers`.

From the spec:

Quote
If an entry value in the array is NULL, the implementation skips copying the program binary for the specific device identified by the array index.

I tried the above code and cannot reproduce the crash. What platform are you running on?

shaman