CL/GL interop demo on MacBook Pro

Started by hexatronic, May 21, 2017, 18:32:57


hexatronic

Hi,

I'm trying to run the CL/GL interop demo on a MacBook Pro, using LWJGL 3.1.1.

As far as I can tell, the demo tries to run the Mandelbrot computation and display on both the GPU and the CPU, if available.

The CPU side works, but for the GPU I'm getting this:

[Apple - GPU] cl_context_callback
   Info: [CL_INVALID_DEVICE] : OpenCL Error : clCreateCommandQueue failed: Unable to locate device 0x1024500 in context 0x7fcb78e658a0.


My system details:
MacBook Pro (Retina, 15-inch, Mid 2014)
OSX Yosemite 10.10.1

The discrete GPU on my system is an NVIDIA GeForce GT 750M with 2048 MB.
I think it also has an integrated GPU, the Intel Iris Pro. The demo seems to be trying to set up on the Iris, based on the device ID mentioned in the error.

Just a hunch, but I am wondering whether confusion about which GPU is active might be the source of the problem.

The output of CLDemo, with all the info, is below.

Any idea how to move this forward?



-------------------------
NEW PLATFORM: [0x7FFF0000]
   CL_PLATFORM_PROFILE = FULL_PROFILE
   CL_PLATFORM_VERSION = OpenCL 1.2 (Sep 20 2014 22:01:02)
   CL_PLATFORM_NAME = Apple
   CL_PLATFORM_VENDOR = Apple
   CL_PLATFORM_EXTENSIONS = cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event


   ** NEW DEVICE: [0xFFFFFFFF]
   CL_DEVICE_TYPE = 2
   CL_DEVICE_VENDOR_ID = -1
   CL_DEVICE_MAX_COMPUTE_UNITS = 8
   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
   CL_DEVICE_MAX_WORK_GROUP_SIZE = 1024
   CL_DEVICE_MAX_CLOCK_FREQUENCY = 2500
   CL_DEVICE_ADDRESS_BITS = 64
   CL_DEVICE_AVAILABLE = true
   CL_DEVICE_COMPILER_AVAILABLE = true
   CL_DEVICE_NAME = Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
   CL_DEVICE_VENDOR = Intel
   CL_DRIVER_VERSION = 1.1
   CL_DEVICE_PROFILE = FULL_PROFILE
   CL_DEVICE_VERSION = OpenCL 1.2
   CL_DEVICE_EXTENSIONS = cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
   CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.2
      -TRYING TO EXEC NATIVE KERNEL-
      KERNEL EXEC argument: 1337, should be 1337
      Event callback status: CL_COMPLETE

      EMPTY NATIVE KERNEL AVG EXEC TIME: 13.6670us

      Sub Buffer destructed: 140654185474272
      Buffer destructed (2): 140654188521344
      Buffer destructed (1): 140654188521344

   ** NEW DEVICE: [0x1024500]
   CL_DEVICE_TYPE = 4
   CL_DEVICE_VENDOR_ID = 16925952
   CL_DEVICE_MAX_COMPUTE_UNITS = 40
   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
   CL_DEVICE_MAX_WORK_GROUP_SIZE = 512
   CL_DEVICE_MAX_CLOCK_FREQUENCY = 1200
   CL_DEVICE_ADDRESS_BITS = 64
   CL_DEVICE_AVAILABLE = true
   CL_DEVICE_COMPILER_AVAILABLE = true
   CL_DEVICE_NAME = Iris Pro
   CL_DEVICE_VENDOR = Intel
   CL_DRIVER_VERSION = 1.2(Sep 25 2014 22:25:51)
   CL_DEVICE_PROFILE = FULL_PROFILE
   CL_DEVICE_VERSION = OpenCL 1.2
   CL_DEVICE_EXTENSIONS = cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_image2d_from_buffer cl_khr_gl_depth_images cl_khr_depth_images cl_khr_3d_image_writes cl_khr_gl_msaa_sharing
   CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.2

      Sub Buffer destructed: 140654185575952
      Buffer destructed (2): 140654185474656
      Buffer destructed (1): 140654185474656

   ** NEW DEVICE: [0x1022700]
   CL_DEVICE_TYPE = 4
   CL_DEVICE_VENDOR_ID = 16918272
   CL_DEVICE_MAX_COMPUTE_UNITS = 2
   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
   CL_DEVICE_MAX_WORK_GROUP_SIZE = 1024
   CL_DEVICE_MAX_CLOCK_FREQUENCY = 925
   CL_DEVICE_ADDRESS_BITS = 32
   CL_DEVICE_AVAILABLE = true
   CL_DEVICE_COMPILER_AVAILABLE = true
   CL_DEVICE_NAME = GeForce GT 750M
   CL_DEVICE_VENDOR = NVIDIA
   CL_DRIVER_VERSION = 10.0.43 310.41.05f01
   CL_DEVICE_PROFILE = FULL_PROFILE
   CL_DEVICE_VERSION = OpenCL 1.2
   CL_DEVICE_EXTENSIONS = cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422
   CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.2

      Sub Buffer destructed: 140654187450480
      Buffer destructed (2): 140654187449776
      Buffer destructed (1): 140654187449776
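
A note for reading the dump above: CL_DEVICE_TYPE is printed as a raw bitfield and CL_DEVICE_VENDOR_ID as a signed decimal integer. The sketch below decodes both. The class and method names are made up for illustration and are not part of LWJGL or the demo; the bit values are those defined in the OpenCL headers. In this particular dump the vendor ID, converted to hex, happens to equal the bracketed device ID, so 16925952 is 0x1024500, the Iris Pro named in the error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for reading the log; not part of LWJGL or the demo.
final class ClLogDecoder {
    // Bit values of cl_device_type as defined in the OpenCL headers.
    static final long CL_DEVICE_TYPE_DEFAULT     = 1L << 0;
    static final long CL_DEVICE_TYPE_CPU         = 1L << 1;
    static final long CL_DEVICE_TYPE_GPU         = 1L << 2;
    static final long CL_DEVICE_TYPE_ACCELERATOR = 1L << 3;

    // Turns a raw CL_DEVICE_TYPE value into the names of the bits that are set.
    static List<String> deviceTypeNames(long type) {
        List<String> names = new ArrayList<>();
        if ((type & CL_DEVICE_TYPE_DEFAULT) != 0)     names.add("DEFAULT");
        if ((type & CL_DEVICE_TYPE_CPU) != 0)         names.add("CPU");
        if ((type & CL_DEVICE_TYPE_GPU) != 0)         names.add("GPU");
        if ((type & CL_DEVICE_TYPE_ACCELERATOR) != 0) names.add("ACCELERATOR");
        return names;
    }

    // Formats a signed 32-bit value the way the bracketed device IDs are printed.
    static String toHexId(int value) {
        return "0x" + Integer.toHexString(value).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(deviceTypeNames(2));   // [CPU]
        System.out.println(deviceTypeNames(4));   // [GPU]
        System.out.println(toHexId(16925952));    // 0x1024500 (Iris Pro)
        System.out.println(toHexId(16918272));    // 0x1022700 (GeForce GT 750M)
        System.out.println(toHexId(-1));          // 0xFFFFFFFF (CPU device)
    }
}
```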




spasi

Hey hexatronic,

Looks like it's simply a problem of the demo code that chooses OpenCL devices. I have tried it with multiple platforms/devices, but haven't seen 3 devices on the same platform. The demo currently always chooses the first two devices with OpenGL interop support, which happen to be the Intel CPU and the Intel GPU on your system. The code that chooses a device based on type is in the getDevice method, at Mandelbrot.java:462. It should be easy to change to make it choose the Nvidia device.

Modifying the demo so that it shows 3 windows with all 3 devices should also be possible, but probably harder.
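
To illustrate why a "first match" rule lands on the Iris Pro here, the following is a simplified model of the selection described above. It is not the actual code from Mandelbrot.java: the Device class, the method names, and the exact extension check are assumptions made for the sketch, and the extension strings are shortened from the log.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of "pick the first device of a type
// that advertises GL sharing". Not the demo's real implementation.
final class FirstMatchSelection {
    static final long CPU = 1L << 1; // CL_DEVICE_TYPE_CPU
    static final long GPU = 1L << 2; // CL_DEVICE_TYPE_GPU

    static final class Device {
        final String name;
        final long type;
        final String extensions;

        Device(String name, long type, String extensions) {
            this.name = name;
            this.type = type;
            this.extensions = extensions;
        }
    }

    // Assumption: interop support is detected from the extension string.
    static boolean supportsGlSharing(Device d) {
        for (String ext : d.extensions.split(" ")) {
            if (ext.equals("cl_khr_gl_sharing") || ext.equals("cl_APPLE_gl_sharing")) {
                return true;
            }
        }
        return false;
    }

    // Returns the first device of the given type with GL sharing, or null.
    static Device firstOfType(List<Device> devices, long type) {
        for (Device d : devices) {
            if ((d.type & type) != 0 && supportsGlSharing(d)) {
                return d;
            }
        }
        return null;
    }

    // Devices in the order the platform reported them in the log.
    static List<Device> devicesFromLog() {
        return Arrays.asList(
            new Device("Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz", CPU, "cl_APPLE_gl_sharing cl_khr_gl_event"),
            new Device("Iris Pro", GPU, "cl_APPLE_gl_sharing cl_khr_gl_event"),
            new Device("GeForce GT 750M", GPU, "cl_APPLE_gl_sharing cl_khr_gl_event"));
    }

    public static void main(String[] args) {
        List<Device> devices = devicesFromLog();
        System.out.println(firstOfType(devices, CPU).name); // the Intel CPU
        System.out.println(firstOfType(devices, GPU).name); // Iris Pro
    }
}
```

Because the Iris Pro is listed before the GeForce and both advertise GL sharing, the GeForce is never considered unless an extra filter is added.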

hexatronic

Yes, that worked.

I added this into the loop part of getDevice:

            // Skip any GPU device that is not from NVIDIA.
            String vendor = getDeviceInfoStringUTF8(device, CL_DEVICE_VENDOR);
            if (deviceType == CL_DEVICE_TYPE_GPU && !vendor.equals("NVIDIA")) {
                continue;
            }


In production, I am guessing you would loop over the GPUs that the host machine reports and try them in turn?

I am pretty sure the Intel Iris GPU is also perfectly capable of OpenCL / GL interop, it's just not working in my case because that GPU is not active.

Unless there's some way that the operating system can report which one is "active"?

spasi

A few thoughts:

- This sounds like a bug in Apple's OpenCL implementation.
- If the GPU device is inactive, then it should not be listed as available in OpenCL.
- If the GPU device can be used for compute, but cannot be used for graphics interop, then it should be listed as available in OpenCL, but without the *_gl_* extensions.
- An application should indeed try the "best" available device and fall back to other device(s) if that one fails. "Best" means whatever is appropriate for the particular application, based on the exposed capabilities and characteristics of the device.
- Another thing you could try is: create a hidden/temporary GLFW window and query the GPU vendor of the created context. Use that vendor to select the corresponding OpenCL device.
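
The last two points could be combined roughly as follows. This is only a sketch: the Device class, the method names, and the trySetup callback are hypothetical, and the example vendor strings are assumed values rather than output from this machine. In a real application the GL vendor string would come from glGetString(GL_VENDOR) on the temporary context, and trySetup would attempt to create the shared CL context and command queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical sketch: prefer the device whose vendor matches the GL
// context, then fall back to the remaining devices in reported order.
final class DeviceFallback {
    static final class Device {
        final String name;
        final String vendor; // value of CL_DEVICE_VENDOR

        Device(String name, String vendor) {
            this.name = name;
            this.vendor = vendor;
        }
    }

    // True if the CL vendor occurs in the GL vendor string, ignoring case.
    static boolean vendorMatches(String glVendor, String clVendor) {
        return glVendor.toLowerCase(Locale.ROOT)
                       .contains(clVendor.toLowerCase(Locale.ROOT));
    }

    // Devices matching the GL vendor come first; relative order is kept otherwise.
    static List<Device> orderByGlVendor(List<Device> devices, String glVendor) {
        List<Device> ordered = new ArrayList<>();
        for (Device d : devices) {
            if (vendorMatches(glVendor, d.vendor)) ordered.add(d);
        }
        for (Device d : devices) {
            if (!vendorMatches(glVendor, d.vendor)) ordered.add(d);
        }
        return ordered;
    }

    // Tries each device in turn; returns the first one whose setup succeeds, or null.
    static Device firstWorking(List<Device> ordered, Predicate<Device> trySetup) {
        for (Device d : ordered) {
            try {
                if (trySetup.test(d)) return d;
            } catch (RuntimeException e) {
                // Setup failed for this device; move on to the next candidate.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Device> gpus = Arrays.asList(
            new Device("Iris Pro", "Intel"),
            new Device("GeForce GT 750M", "NVIDIA"));

        // Assumed GL_VENDOR value for a context running on the discrete GPU.
        List<Device> ordered = orderByGlVendor(gpus, "NVIDIA Corporation");

        // Stand-in for real context/queue creation: the Iris Pro fails here.
        Device chosen = firstWorking(ordered, d -> !d.name.equals("Iris Pro"));
        System.out.println(chosen.name); // GeForce GT 750M
    }
}
```

Matching on a substring is a deliberate simplification; a stricter comparison may be needed if vendor strings overlap.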

hexatronic

Good ideas, thanks for your help.