OpenCL gl-shared resident buffer, kaputt.

Started by basil, June 30, 2014, 15:52:34


basil

Just wondering if anybody has gotten this to work.

I'm creating the VBO the fancy way, using ..

// query the buffer's GPU address and make it resident (NV_shader_buffer_load)
long ptr = NVShaderBufferLoad.glGetNamedBufferParameterui64NV(id, NVShaderBufferLoad.GL_BUFFER_GPU_ADDRESS_NV);
NVShaderBufferLoad.glMakeNamedBufferResidentNV(id, GL15.GL_READ_ONLY);

.. then NVVertexBufferUnifiedMemory together with the GPU address, and so on (see the sketch below) ..

.. which works super nicely on GL 4.3+ hardware.
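For completeness, the "and so on" part looks roughly like this in my setup (a minimal sketch; the enum/function names come from the NV_vertex_buffer_unified_memory extension, while attribute index 0, the tightly packed float layout and the 'size' variable are just assumptions for illustration):

// enable vertex pulling via GPU addresses instead of bound VBOs
GL11.glEnableClientState(NVVertexBufferUnifiedMemory.GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV);
// describe attribute 0 : 3 floats, tightly packed (assumed layout)
NVVertexBufferUnifiedMemory.glVertexAttribFormatNV(0, 3, GL11.GL_FLOAT, false, 0);
// point attribute 0 at the resident buffer's GPU address ('size' = buffer size in bytes)
NVVertexBufferUnifiedMemory.glBufferAddressRangeNV(NVVertexBufferUnifiedMemory.GL_VERTEX_ATTRIB_ARRAY_ADDRESS_NV, 0, ptr, size);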

Now, what I cannot figure out: once the buffer is "resident", I cannot get it to work with OpenCL.

It's pretty confusing: accessing the resident buffer from CL does not work, while running a GL compute shader over the same buffer works fine. Then, once I create the OpenCL buffer (clCreateFromGLBuffer()), the compute shader fails too. The funny part is that I can still draw the resident buffer with glDrawElements etc.

I guess clCreateFromGLBuffer is not changing the GPU address, but then what is it doing? What am I missing :) ? Or are we just stuck since OpenCL cannot handle direct GPU address pointers yet?

((Here is some more info about resident/bindless buffers if you're interested. A good sample: https://github.com/Groovounet/ogl-samples/blob/master/samples/gl-420-primitive-bindless-nv.cpp, the extension spec: https://www.opengl.org/registry/specs/NV/vertex_buffer_unified_memory.txt, and since it is NVIDIA-only, some more background: https://developer.nvidia.com/content/bindless-graphics))
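For reference, the interop path I'm attempting looks roughly like this (error checks stripped, and the exact LWJGL overloads are from memory, so treat it as a sketch):

// create the CL view of the GL buffer; this is the call that breaks the compute shader
CLMem clBuffer = CL10GL.clCreateFromGLBuffer(context, CL10.CL_MEM_READ_ONLY, id, null);
// GL must be done with the buffer before CL touches it
CL10GL.clEnqueueAcquireGLObjects(queue, clBuffer, null, null);
// ... clSetKernelArg / clEnqueueNDRangeKernel would go here ...
CL10GL.clEnqueueReleaseGLObjects(queue, clBuffer, null, null);
CL10.clFinish(queue);

One thing I haven't tried yet: making the buffer non-resident (glMakeNamedBufferNonResidentNV) before the acquire and resident again after the release, in case the residency itself is what the CL driver chokes on.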

spasi

Quote from: basil on June 30, 2014, 15:52:34
Or are we just stuck since OpenCL cannot handle direct GPU address pointers yet?

That's the most likely reason. It could probably work with CUDA though.

Btw, why do you want to go through OpenCL if GL compute works fine?

basil

Oh right .. that was another question that bugs me. Short story:

I ported my Java r-tree holding triangles to a GLSL compute shader to do per-vertex AO, raytraced shadows, etc. .. stuff that works fine on the CPU. Then, after overcoming the no-recursion constraint, I realized that running a compute shader task for longer than ~1 sec. does not go down well with the driver (I guess). *edit* (Here's a little example of precomputed per-vertex AO on the CPU if you'd like to see it; takes ~5 sec. on 4 cores: http://memleaks.net/lws_shots/obj_jun.4-20.29.25_45234.png)

Some calculations (depending on ray count, number of elements processed, and so on) take "some" time on the CPU, sometimes up to a couple of minutes. Executing such a task on the compute shader makes the driver think "your GPU is not responding" and it resets :o. Is there a way to disable that? I mean, it's meant to run for a while, not necessarily in realtime. Afaik OpenCL doesn't care how long a kernel executes.

spasi

That's "Timeout Detection and Recovery", a Windows WDDM feature that kicks-in when a rendering task takes too long. You've got two options: a) Disable TDR, there's a registry key that does it, iirc, or b) Split the compute task into shorter subtasks, such that TDR is not triggered.