OpenCL

Started by dronus, February 10, 2010, 14:46:50

spasi

I just committed OpenCL support to LWJGL! For a quick preview, click here.

Features:

- Full OpenCL 1.0 and 1.1 support.
- All OpenCL extensions are supported.
- CL/GL interop support.
- Low-level binding (COMPLETE) + high-level API for convenience (WIP).

TODO/Known Issues:

- Need to build and test on MacOS.
- CL/GL interop is not supported in MacOS. Apparently it requires CGL and we don't use that in LWJGL.
- Need to add more high-level API, there are quite a few annoying methods that can be improved.
- It's beta quality really, needs a lot more testing. If you try it out and it breaks somewhere, or if you have a suggestion for an improvement, just let me know.
- The current codebase requires Java 1.5. Need to sort this out somehow.

Matzon

awesome work!

I will test on my available platforms asap and report back.

As for the 1.5 issue, I *really* don't have an issue with it, but there might be an issue with legacy Mac users. Does anyone have any numbers?

delt0r

I have had no problems with Mac users, and I force them to use Java 1.6, while I get the odd email from Windows users. About half my users are on Macs, but this is for scientific code, which could skew the stats somewhat.

I'm sure princec has some numbers.

spasi

Hudson is being weird. I've uploaded a build for people that can't build from svn:

edit:

Removed, you can download everything from Hudson now.

spasi

I tried to compile LWJGL with -g:none and lwjgl.jar drops from 819kb to 618kb, a good 25%. I can change the build script so that the debug jar is compiled with debug information and the normal jar without. Should I go for it or is it a bad idea?

ruben01

shouldn't that be a deployer optimization? like using pack200 or using proguard to remove unused classes.
I think the lwjgl library should have the java debug info included.

Matzon

Quote from: ruben01 on September 27, 2010, 18:15:43
shouldn't that be a deployer optimization? like using pack200 or using proguard to remove unused classes.
I think the lwjgl library should have the java debug info included.

Agreed - we shouldn't be so scared about the size of the Java code, since it will be (or at least should be) obfuscated before deployment.

Matzon

Windows 7, 32 bit, AMD - OK

-------------------------
NEW PLATFORM: 75289612
OpenCL 1.1 - Extensions:
-------------------------
        CL_PLATFORM_PROFILE = FULL_PROFILE
        CL_PLATFORM_VERSION = OpenCL 1.1 ATI-Stream-v2.2 (302)
        CL_PLATFORM_NAME = ATI Stream
        CL_PLATFORM_VENDOR = Advanced Micro Devices, Inc.


        NEW DEVICE: 77719864
OpenCL 1.1 - Extensions: cl_amd_device_attribute_query cl_amd_fp64 cl_amd_printf cl_ext_device_fission cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
        -------------------------
        CL_DEVICE_TYPE = 2
        CL_DEVICE_VENDOR_ID = 4098
        CL_DEVICE_MAX_COMPUTE_UNITS = 2
        CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
        CL_DEVICE_MAX_WORK_GROUP_SIZE = 1024
        CL_DEVICE_MAX_CLOCK_FREQUENCY = 1995
        CL_DEVICE_ADDRESS_BITS = 32
        CL_DEVICE_AVAILABLE = true
        CL_DEVICE_COMPILER_AVAILABLE = true
        CL_DEVICE_NAME = Intel(R) Core(TM)2 Duo CPU     T6400  @ 2.00GHz

        CL_DEVICE_VENDOR = GenuineIntel
        CL_DRIVER_VERSION = 2.0
        CL_DEVICE_PROFILE = FULL_PROFILE
        CL_DEVICE_VERSION = OpenCL 1.1 ATI-Stream-v2.2 (302)
        CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.1
-TRYING TO EXEC NATIVE KERNEL-
memobjs = 1
memobjs[0].remaining() = 128
Sub Buffer destructed: 77725336
SECOND Buffer destructed: 77725200
FIRST Buffer destructed: 77725200

        NEW DEVICE: 77720344
OpenCL 1.0 - Extensions: cl_amd_device_attribute_query cl_khr_gl_sharing
        -------------------------
        CL_DEVICE_TYPE = 4
        CL_DEVICE_VENDOR_ID = 4098
        CL_DEVICE_MAX_COMPUTE_UNITS = 8
        CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
        CL_DEVICE_MAX_WORK_GROUP_SIZE = 128
        CL_DEVICE_MAX_CLOCK_FREQUENCY = 550
        CL_DEVICE_ADDRESS_BITS = 32
        CL_DEVICE_AVAILABLE = true
        CL_DEVICE_COMPILER_AVAILABLE = true
        CL_DEVICE_NAME = ATI RV730

        CL_DEVICE_VENDOR = Advanced Micro Devices, Inc.
        CL_DRIVER_VERSION = CAL 1.4.792
        CL_DEVICE_PROFILE = FULL_PROFILE
        CL_DEVICE_VERSION = OpenCL 1.0 ATI-Stream-v2.2 (302)
        CL_DEVICE_OPENCL_C_VERSION = OpenCL C 1.0
Sub Buffer destructed: 77725336
SECOND Buffer destructed: 77725200
FIRST Buffer destructed: 77725200
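
For anyone reading these logs: CL_DEVICE_TYPE is printed as a raw bitfield. A minimal sketch for decoding it, assuming the standard cl_device_type bit values from the OpenCL headers (DEFAULT = 1, CPU = 2, GPU = 4, ACCELERATOR = 8). The class and method names are made up for illustration and are not part of LWJGL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an LWJGL class.
public class DeviceTypeDecoder {
    // Bit values of cl_device_type as defined in the OpenCL headers.
    static final long CL_DEVICE_TYPE_DEFAULT     = 1L << 0;
    static final long CL_DEVICE_TYPE_CPU         = 1L << 1;
    static final long CL_DEVICE_TYPE_GPU         = 1L << 2;
    static final long CL_DEVICE_TYPE_ACCELERATOR = 1L << 3;

    // Turns the raw bitfield into a readable name such as "CPU" or "DEFAULT|GPU".
    static String describe(long type) {
        List<String> names = new ArrayList<String>();
        if ((type & CL_DEVICE_TYPE_DEFAULT) != 0)     names.add("DEFAULT");
        if ((type & CL_DEVICE_TYPE_CPU) != 0)         names.add("CPU");
        if ((type & CL_DEVICE_TYPE_GPU) != 0)         names.add("GPU");
        if ((type & CL_DEVICE_TYPE_ACCELERATOR) != 0) names.add("ACCELERATOR");
        // None of the four known bits is set.
        if (names.isEmpty()) return "UNKNOWN";
        return String.join("|", names);
    }

    public static void main(String[] args) {
        System.out.println(describe(2)); // prints "CPU" (the Core 2 Duo device above)
        System.out.println(describe(4)); // prints "GPU" (the RV730 device above)
    }
}
```

So in the log above, the first device (type 2) is the CPU exposed through ATI Stream and the second (type 4) is the GPU.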

spasi

Another update today, fixed a few bugs and the "high-level" API should be more or less complete now. Also ported the entire codebase to use Java 1.5 features.

Matzon

Are we ready to release 2.6 or should we wait a bit ?

princec

...2.6 is Java 5.0+ is it? That should be 3.0?

Cas :)

kappa

Quote from: princec on September 29, 2010, 10:46:43
...2.6 is Java 5.0+ is it? That should be 3.0?

Cas :)

oh, I thought we were saving 3.0 for major api breakage and new features but guess LWJGL 4.0 is still available :)

changes such as:

1) Moving the native window code out of the opengl package, possibly into org.lwjgl.display.*, thus making way for stuff like OpenGL ES, DirectX, etc. to be plugged in easily.

2) Multiple Window/Display support which would probably require api breakage.

3) Proper lightweight native window system for LWJGL on mac (Cocoa?) and not relying on AWT.

4) Proper alternative to Display.setParent and AWTGLCanvas (maybe some sort of merge).

5) Support for OpenGL ES.

etc. (I'm sure there are a few others but I can't recall them atm).

but then again OpenGL 4.1 and OpenCL support is a pretty big addition.


oh btw, just curious: what about making the OpenCL stuff a standalone Java 1.5 jar and native (thus keeping core LWJGL 2.x Java 1.4 compatible until LWJGL 3.0)? Probably not worth the effort though if the OpenCL stuff is already interwoven with LWJGL code.

spasi

Quote from: Matzon on September 29, 2010, 08:52:41
Are we ready to release 2.6 or should we wait a bit ?

Doing a few tests atm, better wait a bit. In any case, I don't think the 2.6 release will be stable. The OpenCL implementation is barely tested and things will change as more people start experimenting with it and the vendors release better implementations (that may break assumptions I've made while coding this binding). Unfortunately I don't have a proper app that I can throw at it for testing, like I can do with Marathon and OpenGL. But anyway, it doesn't really matter as long as the rest of the library works. People working with OpenCL will expect instabilities.

Quote from: kappa on September 29, 2010, 11:14:02
oh, I thought we were saving 3.0 for major api breakage and new features but guess LWJGL 4.0 is still available :)

changes such as:

1) Moving the native window code out of the opengl package, possibly into org.lwjgl.display.*, thus making way for stuff like OpenGL ES, DirectX, etc. to be plugged in easily.

2) Multiple Window/Display support which would probably require api breakage.

3) Proper lightweight native window system for LWJGL on mac (Cocoa?) and not relying on AWT.

4) Proper alternative to Display.setParent and AWTGLCanvas (maybe some sort of merge).

5) Support for OpenGL ES.

etc. (I'm sure there are a few others but I can't recall them atm).

but then again OpenGL 4.1 and OpenCL support is a pretty big addition.


oh btw, just curious: what about making the OpenCL stuff a standalone Java 1.5 jar and native (thus keeping core LWJGL 2.x Java 1.4 compatible until LWJGL 3.0)? Probably not worth the effort though if the OpenCL stuff is already interwoven with LWJGL code.

I agree on everything. LWJGL 3.0 should be the major refactoring + API breakage. I tried to implement multiple windows sometime this summer, while trying to keep compatibility, it's a mess. If we're going to implement all of the above we'll need to redesign a lot of code.

About OpenCL, the overhead is minimal tbh, only ~100kb compared to 2.5 for lwjgl.jar. I don't think it's worth the pain of deploying an additional jar and native lib. Also, I've made a 2.5 branch on our SVN, in case we need to back-port an important 1.4-compatible fix.

Momoko_Fan

I tried running the HelloOpenCL demo:
-------------------------
NEW PLATFORM: 80292888
OpenCL 1.0 - Extensions: cl_khr_d3d10_sharing cl_khr_gl_sharing cl_khr_icd 
-------------------------
    CL_PLATFORM_PROFILE = FULL_PROFILE
    CL_PLATFORM_VERSION = OpenCL 1.0 CUDA 3.1.1
    CL_PLATFORM_NAME = NVIDIA CUDA
    CL_PLATFORM_VENDOR = NVIDIA Corporation
    CL_PLATFORM_EXTENSIONS = cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll 


    NEW DEVICE: 80292992
OpenCL 1.0 - Extensions: cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics 
    -------------------------
    CL_DEVICE_TYPE = 4
    CL_DEVICE_VENDOR_ID = 4318
    CL_DEVICE_MAX_COMPUTE_UNITS = 2
    CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
    CL_DEVICE_MAX_WORK_GROUP_SIZE = 512
    CL_DEVICE_MAX_CLOCK_FREQUENCY = 800
    CL_DEVICE_ADDRESS_BITS = 32
    CL_DEVICE_AVAILABLE = true
    CL_DEVICE_COMPILER_AVAILABLE = true
    CL_DEVICE_NAME = GeForce 8400M GS
    CL_DEVICE_VENDOR = NVIDIA Corporation
    CL_DRIVER_VERSION = 258.96
    CL_DEVICE_PROFILE = FULL_PROFILE
    CL_DEVICE_VERSION = OpenCL 1.0 CUDA
Exception in thread "main" java.lang.IllegalArgumentException: Invalid parameter specified: 0x103D
        at org.lwjgl.opencl.InfoUtilAbstract.getSizeRet(InfoUtilAbstract.java:129)
        at org.lwjgl.opencl.InfoUtilAbstract.getInfoString(InfoUtilAbstract.java:114)
        at org.lwjgl.opencl.CLDevice.getInfoString(CLDevice.java:103)
        at org.lwjgl.test.opencl.HelloOpenCL.printDeviceInfo(HelloOpenCL.java:170)
        at org.lwjgl.test.opencl.HelloOpenCL.execute(HelloOpenCL.java:94)
        at org.lwjgl.test.opencl.HelloOpenCL.main(HelloOpenCL.java:183)

It works okay until the exception; not sure why it's happening.

EDIT: The Mandelbrot demo runs at about 17 fps.
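
A note on the exception above: in the OpenCL headers, 0x103D is the value of CL_DEVICE_OPENCL_C_VERSION, a device query added in OpenCL 1.1. The AMD log earlier in the thread prints that property right after CL_DEVICE_VERSION, so it looks like the demo asks an OpenCL 1.0 device for a 1.1-only parameter. One way to guard such queries is to parse the CL_DEVICE_VERSION string first. This is only a sketch of that idea, with made-up class and method names, and not necessarily how the demo or the binding handles it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an LWJGL class.
public class DeviceVersionCheck {
    // The spec formats CL_DEVICE_VERSION as "OpenCL <major>.<minor> <vendor-specific info>".
    private static final Pattern VERSION = Pattern.compile("^OpenCL (\\d+)\\.(\\d+)");

    // Returns true if the reported version is at least major.minor.
    static boolean supports(String deviceVersion, int major, int minor) {
        Matcher m = VERSION.matcher(deviceVersion);
        if (!m.find()) return false; // unrecognised string: be conservative
        int devMajor = Integer.parseInt(m.group(1));
        int devMinor = Integer.parseInt(m.group(2));
        return devMajor > major || (devMajor == major && devMinor >= minor);
    }

    public static void main(String[] args) {
        // Version strings copied from the logs in this thread.
        System.out.println(supports("OpenCL 1.0 CUDA", 1, 1));                  // prints "false"
        System.out.println(supports("OpenCL 1.1 ATI-Stream-v2.2 (302)", 1, 1)); // prints "true"
    }
}
```

With a check like this, the demo would only query CL_DEVICE_OPENCL_C_VERSION when supports(version, 1, 1) returns true, and skip it on 1.0 devices such as the GeForce 8400M GS.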

spasi

Tnx Momoko_Fan, it's fixed now. Could you try the fractal demo with the next build and let me know if you get the same fps? You should also get a message on whether or not it runs on the GPU.

Matzon, I think I'm done now, feel free to release 2.6 whenever you can.