LWJGL 3.1.1 - OpenCL Example - Doesn't execute kernel

Started by officialhopsof, May 17, 2017, 16:41:08


officialhopsof

I am attempting to port some Java code to OpenCL to perform some engineering calculations that take a few hours to run on the CPU.
Overall documentation for using OpenCL in LWJGL 3.1.1 is horrible at best. The link in the wiki to the OpenCL page just redirects to the main Wiki page, leaving the only resource to go by being the examples which have no explanation of what they are doing.

So, I wanted to start by running the Example provided by LWJGL here:
https://github.com/LWJGL/lwjgl3/blob/master/modules/core/src/test/java/org/lwjgl/demo/opencl/CLDemo.java

A little past halfway down, I get to these lines:

                    destructorLatch = null;
                }

                long exec_caps = getDeviceInfoLong(device, CL_DEVICE_EXECUTION_CAPABILITIES);
                if ((exec_caps & CL_EXEC_NATIVE_KERNEL) == CL_EXEC_NATIVE_KERNEL) {
                    System.out.println("\t\t-TRYING TO EXEC NATIVE KERNEL-");
                    long queue = clCreateCommandQueue(context, device, NULL, errcode_ret);

                    PointerBuffer ev = BufferUtils.createPointerBuffer(1);

                    ByteBuffer kernelArgs = BufferUtils.createByteBuffer(4);


That conditional fails. On my system:
exec_caps = 1
CL_EXEC_NATIVE_KERNEL = 2
so (1 & 2) == 0, which is != 2

A little further down it looks like the kernel is created (I think), but this code is never executed since that conditional fails. The output of the example on my system is:

Quote

-------------------------
NEW PLATFORM: [0x54A6A0]
   CL_PLATFORM_PROFILE = FULL_PROFILE
   CL_PLATFORM_VERSION = OpenCL 1.2 CUDA 8.0.0
   CL_PLATFORM_NAME = NVIDIA CUDA
   CL_PLATFORM_VENDOR = NVIDIA Corporation
   CL_PLATFORM_EXTENSIONS = cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
   CL_PLATFORM_ICD_SUFFIX_KHR = NV


   ** NEW DEVICE: [0x54A380]
   CL_DEVICE_TYPE = 4
   CL_DEVICE_VENDOR_ID = 4318
   CL_DEVICE_MAX_COMPUTE_UNITS = 28
   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
   CL_DEVICE_MAX_WORK_GROUP_SIZE = 1024
   CL_DEVICE_MAX_CLOCK_FREQUENCY = 1582
   CL_DEVICE_ADDRESS_BITS = 64
   CL_DEVICE_AVAILABLE = true
   CL_DEVICE_COMPILER_AVAILABLE = true
   CL_DEVICE_NAME = GeForce GTX 1080 Ti
   CL_DEVICE_VENDOR = NVIDIA Corporation
   CL_DRIVER_VERSION = 381.89
   CL_DEVICE_PROFILE = FULL_PROFILE
   CL_DEVICE_VERSION = OpenCL 1.2 CUDA
   CL_DEVICE_EXTENSIONS = cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
   CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.2

      Sub Buffer destructed: 772431360
      Buffer destructed (2): 772439520
      Buffer destructed (1): 772439520


Any ideas why that conditional is failing, not allowing me to create and execute a kernel in the example?

spasi

Quote from: officialhopsof on May 17, 2017, 16:41:08
Overall documentation for using OpenCL in LWJGL 3.1.1 is horrible at best.

It is outside the scope of LWJGL to teach you how to use the binding APIs. There are plenty of resources and tutorials available elsewhere. The only issue is how to map C/C++ code to Java and LWJGL. The wiki has enough information about that and it simply takes a bit of practice to get used to it.

For OpenCL in particular, the only LWJGL-specific API that exists is CL.createPlatformCapabilities and CL.createDeviceCapabilities. They have sufficient javadoc and it should be obvious what they do, but do ask if you're not sure about something.

Quote from: officialhopsof on May 17, 2017, 16:41:08
The link in the wiki to the OpenCL page just redirects to the main Wiki page, leaving the only resource to go by being the examples which have no explanation of what they are doing.

LWJGL cannot cover every API or every feature of a particular API with extensive documentation and code samples. It already comes with a ton of javadoc and links to external resources. Any other material that exists has been contributed by users for other users to have an easier time getting started. OpenCL is not very popular and it's normal that there's not much content available. You're welcome to contribute more if you'd like.

Quote from: officialhopsof on May 17, 2017, 16:41:08
Any ideas why that conditional is failing, not allowing me to create and execute a kernel in the example?

CLDemo is a quick test to see what OpenCL platforms/devices are available and their capabilities. It is not useful as a getting started tutorial. See CLGLInteropDemo for an application that actually does something interesting with OpenCL.

With that said, the CL_EXEC_NATIVE_KERNEL check fails because native kernel execution is not available on Nvidia GPUs. Note that a native kernel is not the same thing as a standard OpenCL kernel. Specifically, clEnqueueNativeKernel "enqueues a command to execute a native C/C++ function not compiled using the OpenCL compiler". It's an optional OpenCL feature, typically available on CPU devices only. You can try CLDemo with Intel's or AMD's OpenCL runtimes to see it in action.
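To make the bitfield reasoning concrete, here is a minimal sketch that decodes the two values from the posted output (exec_caps = 1 and CL_DEVICE_TYPE = 4). It does not call LWJGL; the constants are copied by value from the OpenCL headers, and the class and method names (ExecCapsCheck, supports, deviceTypes) are made up for this illustration.

```java
// Sketch only: decodes the OpenCL bitfields reported in the post.
// Constant values mirror cl.h; the helper names are hypothetical.
final class ExecCapsCheck {
    // cl_device_exec_capabilities bits
    static final long CL_EXEC_KERNEL = 1L << 0;        // standard OpenCL kernels
    static final long CL_EXEC_NATIVE_KERNEL = 1L << 1; // native C/C++ functions

    // cl_device_type bits
    static final long CL_DEVICE_TYPE_DEFAULT = 1L << 0;
    static final long CL_DEVICE_TYPE_CPU = 1L << 1;
    static final long CL_DEVICE_TYPE_GPU = 1L << 2;
    static final long CL_DEVICE_TYPE_ACCELERATOR = 1L << 3;

    /** Same test the demo performs: true only if every bit of flag is set. */
    static boolean supports(long execCaps, long flag) {
        return (execCaps & flag) == flag;
    }

    /** Space-separated names of the device type bits that are set. */
    static String deviceTypes(long type) {
        StringBuilder sb = new StringBuilder();
        if ((type & CL_DEVICE_TYPE_DEFAULT) != 0) sb.append("DEFAULT ");
        if ((type & CL_DEVICE_TYPE_CPU) != 0) sb.append("CPU ");
        if ((type & CL_DEVICE_TYPE_GPU) != 0) sb.append("GPU ");
        if ((type & CL_DEVICE_TYPE_ACCELERATOR) != 0) sb.append("ACCELERATOR ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        long execCaps = 1L;   // exec_caps from the post
        long deviceType = 4L; // CL_DEVICE_TYPE from the demo output

        System.out.println("device type:      " + deviceTypes(deviceType));
        System.out.println("standard kernels: " + supports(execCaps, CL_EXEC_KERNEL));
        System.out.println("native kernels:   " + supports(execCaps, CL_EXEC_NATIVE_KERNEL));
    }
}
```

With those inputs the device decodes as a GPU that supports standard kernels but not native ones, so the skipped branch in CLDemo only concerns clEnqueueNativeKernel and says nothing about whether regular OpenCL kernels can run.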

officialhopsof

Thanks for the help! Sorry for all my moaning. I have never used OpenCL before and I figured the API was LWJGL-specific, hence my frustration with the documentation. In any case, the CLGLInteropDemo doesn't seem to work under LWJGL 3.1.1. The whole org.lwjgl.glfw package appears to be missing (along with the classes that package contains).

spasi

LWJGL is modular: each binding comes as a separate artifact. CLGLInteropDemo requires the LWJGL core (all modules depend on it) and the OpenCL, OpenGL and GLFW bindings. OpenCL normally does not require OpenGL or GLFW, but this particular demo showcases interoperability between OpenCL and OpenGL (a fractal computed with OpenCL and rendered with OpenGL in a GLFW window).
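If you are not sure which modules actually ended up on your classpath, a quick check along these lines can help. This is only a sketch: the LwjglModuleCheck class and its isPresent helper are made up for illustration, and the check only looks for one well-known class per module (it says nothing about whether the matching native libraries are present).

```java
// Sketch only: reports which LWJGL modules are visible on the classpath
// by probing one class per module. Helper names are hypothetical.
final class LwjglModuleCheck {

    /** True if the named class can be located, without initializing it. */
    static boolean isPresent(String className) {
        try {
            // initialize=false avoids running static initializers (and native loading)
            Class.forName(className, false, LwjglModuleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One representative class from each module CLGLInteropDemo needs.
        String[] required = {
            "org.lwjgl.Version",   // core
            "org.lwjgl.opencl.CL", // OpenCL binding
            "org.lwjgl.opengl.GL", // OpenGL binding
            "org.lwjgl.glfw.GLFW"  // GLFW binding
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

A MISSING line for org.lwjgl.glfw.GLFW would match the symptom described above and means the GLFW artifact still needs to be added to the project.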

The build configurator is an easy way to explore what's available and download exactly what you need. You may also want to upgrade to LWJGL 3.1.2, which was released 2 days ago.