Moving this discussion here, where it belongs, from the "nVIDIA's Cg" thread:
We need to settle on a good solution to the int pointers -> buffers conversion problem. My personal favorite right now is to pass ByteBuffers directly and fetch the address in the native C code. To get a feel for how costly that is, I wrote a test that uses glVertexPointer as the test case. It will not compile anywhere but here, because I added a buffer version of glVertexPointer to my private library.
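As a rough sketch of what the buffer approach asks of the Java side: the JNI function GetDirectBufferAddress can only return an address for direct buffers, so a wrapper would want to reject anything else before calling into native code. The class and method names below (DirectBufferCheck, requireDirect) are made up for illustration and are not part of LWJGL.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DirectBufferCheck {
    // Hypothetical helper: reject buffers whose address native code could not fetch.
    static ByteBuffer requireDirect(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            throw new IllegalArgumentException("buffer must be a direct ByteBuffer");
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Direct memory in native byte order, the same kind of buffer the timing test uses.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096).order(ByteOrder.nativeOrder());
        // A heap buffer has no stable native address to hand to C code.
        ByteBuffer heap = ByteBuffer.allocate(4096);

        System.out.println("direct accepted: " + (requireDirect(direct) == direct));
        try {
            requireDirect(heap);
        } catch (IllegalArgumentException e) {
            System.out.println("heap rejected: " + e.getMessage());
        }
    }
}
```

A check like this costs something on every call too, so it would be part of whatever overhead the buffer version ends up having.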
Here's the prototype for the new glVertexPointer:
public void vertexPointer2(int size, int type, int stride, ByteBuffer buffer);
Here's the code:
package org.lwjgl.test;

import org.lwjgl.*;
import org.lwjgl.opengl.GL;
import java.nio.*;

/**
 * @author Elias Naur
 */
public class NativeCallTimingTest {
    private final static int WARMUP_ITERATIONS = 5;
    private final static int ITERATIONS = 10000000;

    public static void main(String[] args) {
        GL gl = null;
        try {
            gl = new GL("WindowCreationTest", 50, 50, 320, 240, 16, 0, 0, 0);
            gl.create();
        } catch (Exception e) {
            e.printStackTrace();
            return; // no GL context, nothing to time
        }
        System.out.println("Display created");
        gl.tick();
        long time_taken_nounpack = 0;
        long time_taken_unpack = 0;
        // Repeat the whole measurement a few times; only the last round is reported as FINAL
        for (int j = 0; j < WARMUP_ITERATIONS; j++) {
            long before;
            long after;
            ByteBuffer buffer = ByteBuffer.allocateDirect(4096).order(ByteOrder.nativeOrder());
            int address = Sys.getDirectBufferAddress(buffer);

            // Existing version: the address is fetched once, outside the timed loop
            before = System.currentTimeMillis();
            for (int i = 0; i < ITERATIONS; i++)
                gl.vertexPointer(4, GL.FLOAT, 0, address);
            after = System.currentTimeMillis();
            time_taken_nounpack = (after - before);

            // Buffer version: native code fetches the address on every call
            before = System.currentTimeMillis();
            for (int i = 0; i < ITERATIONS; i++)
                gl.vertexPointer2(4, GL.FLOAT, 0, buffer);
            after = System.currentTimeMillis();
            time_taken_unpack = (after - before);

            System.out.println("No unpack, time taken: " + time_taken_nounpack + " millis");
            System.out.println("With unpack, time taken: " + time_taken_unpack + " millis");
        }
        double ratio = (double)time_taken_unpack/time_taken_nounpack;
        System.out.println("FINAL: No unpack, time taken: " + time_taken_nounpack + " millis");
        System.out.println("FINAL: With unpack, time taken: " + time_taken_unpack + " millis");
        System.out.println("FINAL: Ratio unpack/nounpack: " + ratio);
        gl.destroy();
    }
}
And here's the result:
[elias@ip172 tmp]$ java -Djava.library.path=. -cp lwjgl.jar:lwjgl_test.jar org.lwjgl.test.NativeCallTimingTest
Display created
No unpack, time taken: 2979 millis
With unpack, time taken: 7154 millis
No unpack, time taken: 2970 millis
With unpack, time taken: 7075 millis
No unpack, time taken: 2927 millis
With unpack, time taken: 7070 millis
No unpack, time taken: 2909 millis
With unpack, time taken: 7152 millis
No unpack, time taken: 2893 millis
With unpack, time taken: 7126 millis
FINAL: No unpack, time taken: 2893 millis
FINAL: With unpack, time taken: 7126 millis
FINAL: Ratio unpack/nounpack: 2.4631870031109573
So that's roughly a 2.5x increase in time taken per call when one buffer has to be unpacked.
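To put the totals in per-call terms, here is a small sketch that divides the FINAL figures by the 10,000,000 iterations of the test. The class name PerCallCost is just for illustration; the numbers are copied from the output above.

```java
final class PerCallCost {
    // Average cost of one call in nanoseconds, given a total in milliseconds.
    static double nanosPerCall(long totalMillis, long calls) {
        return totalMillis * 1_000_000.0 / calls;
    }

    public static void main(String[] args) {
        final long calls = 10_000_000L;    // ITERATIONS in the test
        final long noUnpackMillis = 2893;  // FINAL "No unpack" figure
        final long unpackMillis = 7126;    // FINAL "With unpack" figure

        double viaAddress = nanosPerCall(noUnpackMillis, calls);
        double viaBuffer = nanosPerCall(unpackMillis, calls);

        System.out.printf("address argument: %.1f ns per call%n", viaAddress);
        System.out.printf("buffer argument:  %.1f ns per call%n", viaBuffer);
        System.out.printf("extra per call:   %.1f ns%n", viaBuffer - viaAddress);
    }
}
```

That works out to roughly 289 ns per call with the plain address and 713 ns with the buffer, so the unpacking adds about 423 ns to each call on this machine.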
- elias