Random JVM crashes when using glDrawElements with indexed VBO [SOLVED]

Started by Liosan, July 25, 2012, 20:20:43


Liosan

Hello folks, I'm a first time poster here, so hello everybody :)

We are developing a 3D game in Java 1.6 with LWJGL 2.8.3. We coded 3D terrain rendering from a non-indexed VBO, and everything works fine. We then went on to code static mesh rendering using a custom model format, and it works... kind of. Sometimes.

Loading the models into memory works fine; the trouble starts when we try to display them. Quite often (though not always) the JVM crashes on us somewhere in native driver code. This happens on two different 64-bit Windows 7 computers and one 32-bit Linux computer. The crash on Linux provided us with a stack trace pointing to a glDrawElements() call; on Windows I checked that the segfault is due to reading an invalid location. When the game doesn't crash, the models look almost OK (the UVs are somewhat mixed up; I'm not sure if that's because of the free model we used, our Blender exporter, or something else, but generally they look acceptable). The problem seems to be more frequent if we load more models into memory, or if we use more vertices, but it's hard to be sure.

I'll post some code; first, loading the model:
private Mesh loadMesh(Scanner scanner) throws Exception
	{
		Mesh mesh = new Mesh();
		if (!scanner.next().equals("MESH"))
			throw new Exception("Incorrect format: 'MESH' header hasn't been met");
		mesh.name = scanner.next();
		mesh.arrays = scanner.nextInt();
		mesh.verticesCnt = scanner.nextInt();
		mesh.indicesCnt = scanner.nextInt();
		boolean vertex = (mesh.arrays & RenderStateMachine.VERTEX) != 0;
		boolean normal = (mesh.arrays & RenderStateMachine.NORMAL) != 0;
		boolean texcoord = (mesh.arrays & RenderStateMachine.TEXCOORD) != 0;
		
		int vertexSize = 0;
		if (vertex)		{ mesh.vertexOffset = vertexSize * mesh.verticesCnt * 4;	vertexSize += 3; }
		if (normal)		{ mesh.normalOffset = vertexSize * mesh.verticesCnt * 4;  	vertexSize += 3; }
		if (texcoord)	{ mesh.texcoordOffset = vertexSize * mesh.verticesCnt * 4; 	vertexSize += 2; }
		FloatBuffer buffer = BufferUtils.createFloatBuffer(mesh.verticesCnt * vertexSize);
		
		if (vertex)
		{
			if (!scanner.next().equals("VERT"))
				throw new Exception("Incorrect format: 'VERT' header hasn't been met");
			for (int i = 0; i < mesh.verticesCnt * 3; i++)
			{
				float f = scanner.nextFloat();
				buffer.put(f);
			}
		}
		
		if (normal)
		{
			if (!scanner.next().equals("NORM"))
				throw new Exception("Incorrect format: 'NORM' header hasn't been met");
			for (int i = 0; i < mesh.verticesCnt * 3; i++)
				buffer.put(scanner.nextFloat());
		}
		
		if (texcoord)
		{
			if (!scanner.next().equals("TEXC"))
				throw new Exception("Incorrect format: 'TEXC' header hasn't been met");
			for (int i = 0; i < mesh.verticesCnt * 2; i++)
				buffer.put(scanner.nextFloat());
		}
		
		if (!scanner.next().equals("TRIA"))
			throw new Exception("Incorrect format: 'TRIA' header hasn't been met");
		IntBuffer indicesBuffer = BufferUtils.createIntBuffer(mesh.indicesCnt);
		for (int i = 0; i < mesh.indicesCnt; i++)
			indicesBuffer.put(scanner.nextInt());
		
		/* Generate and fill VBO */
		buffer.rewind();
		indicesBuffer.rewind();
		mesh.verticesVboID = GL15.glGenBuffers();
		mesh.indicesVboID = GL15.glGenBuffers();
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, mesh.verticesVboID);
		GL15.glBufferData(GL15.GL_ARRAY_BUFFER, buffer, GL15.GL_STATIC_DRAW);
		GL15.glBindBuffer(GL15.GL_ELEMENT_ARRAY_BUFFER, mesh.indicesVboID);
		GL15.glBufferData(GL15.GL_ELEMENT_ARRAY_BUFFER, indicesBuffer, GL15.GL_STATIC_DRAW);
		
		return mesh;
	}

The above is for a single mesh; a model can have multiple meshes, but I've seen even the simplest single-mesh models crash the JVM. The VBO is not interleaved.
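To make the layout explicit: positions, then normals, then texcoords are packed as contiguous blocks, so each offset is just the total size of the blocks before it. Here's the same arithmetic as a standalone sketch (the class name and vertex count are made up for illustration):

```java
public class BlockLayout {
    /** Byte offsets {vertex, normal, texcoord} plus total float count for a
        non-interleaved VERTEX+NORMAL+TEXCOORD buffer; mirrors loadMesh() above. */
    static int[] offsets(int verticesCnt) {
        int vertexSize = 0; // running count of floats per vertex
        // Each block starts where the previous ones end (4 bytes per float)
        int vertexOffset   = vertexSize * verticesCnt * 4; vertexSize += 3; // xyz
        int normalOffset   = vertexSize * verticesCnt * 4; vertexSize += 3; // xyz
        int texcoordOffset = vertexSize * verticesCnt * 4; vertexSize += 2; // uv
        return new int[] { vertexOffset, normalOffset, texcoordOffset, verticesCnt * vertexSize };
    }

    public static void main(String[] args) {
        int[] o = offsets(100);
        System.out.println(o[0] + " " + o[1] + " " + o[2] + " " + o[3]); // 0 1200 2400 800
    }
}
```

For 100 vertices that gives byte offsets 0 / 1200 / 2400 and 800 floats total, which is what the glVertexAttribPointer offsets in render() rely on.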

Now, display:
public void render(ShaderProgramInfo program, RenderStateMachine rsm, int meshIndex)
	{
		assert(meshIndex < getMeshesCnt());
		Mesh mesh = meshes[meshIndex];
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, mesh.verticesVboID);
		GL15.glBindBuffer(GL15.GL_ELEMENT_ARRAY_BUFFER, mesh.indicesVboID);
		//rsm.setActiveArrays(mesh.arrays);
		if ((mesh.arrays & RenderStateMachine.VERTEX) != 0)
		{
			int posLoc = program.glGetAttribLocation("inPosition");
			GL20.glEnableVertexAttribArray(posLoc);
			GL20.glVertexAttribPointer(posLoc, 3, GL11.GL_FLOAT, false, 0, mesh.vertexOffset);
		}
		
		if ((mesh.arrays & RenderStateMachine.NORMAL) != 0)
		{
			int normalLoc = program.glGetAttribLocation("inNormal");
			GL20.glEnableVertexAttribArray(normalLoc);
			GL20.glVertexAttribPointer(normalLoc, 3, GL11.GL_FLOAT, false, 0, mesh.normalOffset);
		}
		
		if ((mesh.arrays & RenderStateMachine.TEXCOORD) != 0)
		{
			int tcLoc = program.glGetAttribLocation("inTexCoord");
			GL20.glEnableVertexAttribArray(tcLoc);
			GL20.glVertexAttribPointer(tcLoc, 2, GL11.GL_FLOAT, false, 0, mesh.texcoordOffset);
		}
		
		GL11.glDrawElements(GL11.GL_TRIANGLES, mesh.indicesCnt, GL11.GL_UNSIGNED_INT, 0);
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
		GL15.glBindBuffer(GL15.GL_ELEMENT_ARRAY_BUFFER, 0);
	}

The meshes we are using have VERTEX+NORMAL+TEXCOORD set (I checked using prints :)) - all 3 components.
I think that's all the relevant code... I can post the shader that we use, but I doubt a shader would cause a hard crash.

So, uh... does anyone have any ideas? I don't even know what tools or resources to use to start debugging this problem.

The obvious suspects are that we are trying to render more than we allocated, or that the various VBO components don't match. But I triple-checked the math, and sometimes it DOES work, so...

EDIT: a last-minute experiment: if we replace GL11.glDrawElements with glDrawRangeElements(GL_TRIANGLES, 0, 100, ...), we get no crash. Some models look plausible, some look terrible. But the JVM is stable.

Liosan

Fool Running

Nothing looks immediately obvious to me. Have you checked for OpenGL errors (Util.checkGLError)? I assume the program is bound before the call to render(). Are you sure the program is giving you back the indexes you expect? Have you tried disabling rendering with textures and/or normals to zero in on the problem?
Maybe make sure that you aren't getting a negative index when going from a signed int to an unsigned one (that would be weird since it sometimes works, though).
Programmers will, one day, rule the world... and the world won't notice until its too late.Just testing the marquee option ;D

Liosan

Hey, thanks for the interest :-)
Quote from: Fool Running on July 26, 2012, 13:06:10
Have you checked for OpenGL errors (Util.checkGLError)?
Yes, multiple times per frame. I added some extra ones, to no avail.
Quote from: Fool Running on July 26, 2012, 13:06:10
I assume the program is bound before the call to render().
Yes.
Quote from: Fool Running on July 26, 2012, 13:06:10
Are you sure the program is giving you back the indexes you expect?
I don't understand; what do you mean? How does the shader program 'give back' indexes?
Quote from: Fool Running on July 26, 2012, 13:06:10
Have you tried disabling rendering with textures and/or normals to zero-in on the problem?
Yes, doesn't help. Disabling indices helps :P (as in, using an index VBO with index count 0).
Quote from: Fool Running on July 26, 2012, 13:06:10
Maybe make sure that you aren't getting a negative index when being read going from a signed int to an unsigned (that would be weird since it sometimes works, though).
Maximum index is always verticesCnt-1, minimum index is 0. Seems fine.

Does it matter that the mesh loading is initiated in Jython code? I can't think why it could matter, but...
EDIT: we checked, initiating loading in pure Java code doesn't change a thing.
We are also wondering, is there some sort of limit to the size of FloatBuffers or IntBuffers? That's one major thing we don't understand in this code, how the buffers are allocated and how they are used to communicate with native code.
EDIT2: We also had a hypothesis that the GC was deallocating our FloatBuffers (note that after the call to glBufferData they are no longer referenced), and that this was somehow a problem. But we checked this too: keeping a permanent reference to both buffers doesn't help a bit.

Liosan

Fool Running

Quote from: Liosan on July 26, 2012, 18:30:28
Quote from: Fool Running on July 26, 2012, 13:06:10
Are you sure the program is giving you back the indexes you expect?
I don't understand; what do you mean? How does the shader program 'give back' indexes?

Sorry, I meant is the call to program.glGetAttribLocation() giving back the values you would expect? Again, I would be surprised if they were wrong since it sometimes works and you aren't getting any errors, but you never know.  ;)

Maybe the shaders are important? Could you post them? Have you tried with really simple shaders (just set the location and a solid color)? Again, really unlikely, I think.

Last thing I can think of... What graphics cards are on the machines that are failing? Do you have any machines where it works? Are the drivers up-to-date?

Other than that, I give up.  :P

Liosan

Quote from: Fool Running on July 27, 2012, 12:57:41
Sorry, I meant is the call to program.glGetAttribLocation() giving back the values you would expect? Again, I would be surprised if they were wrong since it sometimes works and you aren't getting any errors, but you never know.  ;)
posLoc, normalLoc and tcLoc are equal 0, 1 and 2 respectively.

Quote from: Fool Running on July 27, 2012, 12:57:41
Maybe the shaders are important? Could you post them? Have you tried with really simple shaders (just set the location and a solid color)? Again, really unlikely, I think.
I haven't tested simpler shaders; I'll try later on. Here is the current code; as you can see it's kind of debug-ish, and there's an FBO hooked up for post-processing effects.
staticmodel.vert:
attribute vec4 gl_MultiTexCoord0;

in vec4 inPosition;
in vec3 inNormal;
in vec2 inTexCoord;

varying vec2 texCoords;
varying vec3 worldCoords;

void main(){
    gl_Position = gl_ModelViewProjectionMatrix*inPosition.xzyw;
    texCoords = inTexCoord;
    worldCoords = inPosition.xyz + inNormal * 0.01;
}

staticmodel.frag:
#version 150
#extension GL_EXT_gpu_shader4 : enable
#extension GL_ARB_explicit_attrib_location : enable

uniform sampler2D colorTex;

in vec2 texCoords;
in vec3 worldCoords;

layout(location = 0) out vec4 outColor;
layout(location = 1) out vec4 outTileCoords;

void main(){
	outColor = texture2D(colorTex, texCoords);
	if (outColor.a < 0.1) discard;
	outTileCoords = vec4(1.0, 0.0, worldCoords.z / 100.0, 1.0);
}


Quote from: Fool Running on July 27, 2012, 12:57:41
Last thing I can think of... What graphics cards are on the machines that are failing? Do you have any machines where it works? Are the drivers up-to-date?
My Linux setup uses twin GeForce 8400 GS cards with 512 MB of memory, with reported nVidia driver version 295.40. The newest available is 295.59, but there's no chance I'm updating any time soon ;) I'll post the specs of the Windows machines later on.
EDIT: tested also on a Windows XP machine with an nVidia GeForce 8600 GT. Also a crash.

Just for your information, it currently crashes 100% of the time if I use a large model (50k vertices) and crashes sometimes with a smaller model (1.5k vertices). Terrain chunks have ~3k vertices each (unindexed), and don't cause a crash.

Liosan

Fool Running

Well, unfortunately, I'm out of ideas. Nothing looks suspicious to me. :-\

I'll let someone else take a stab.  ;D

spasi

Hey Liosan,

A few random things you can try:

- Try the ARB_debug_output extension (you'll need to use .withDebug(true) on ContextAttribs for it to be available)
- Try to convert your indices to unsigned shorts, if the meshes are small enough (65536 vertices max).
- Try to use DrawRangeElements, like so: glDrawRangeElements(GL_TRIANGLES, 0, mesh.maxIndex, mesh.indicesCnt, GL_UNSIGNED_SHORT, 0);
- Try to not use a VBO for the indices, like so: glDrawElements(GL_TRIANGLES, indicesBuffer);

Liosan

Oooh, new ideas :) They don't really look like solutions, but they may well help with debugging.
Quote from: spasi on July 28, 2012, 09:29:22
- Try the ARB_debug_output extension (you'll need to use .withDebug(true) on ContextAttribs for it to be available)
So, how do I use that? I tried to google an example, but couldn't. Do you have a link or something? I used new ContextAttribs().withDebug(true), what now?

Quote from: spasi on July 28, 2012, 09:29:22
- Try to convert your indices to unsigned shorts, if the meshes are small enough (65536 vertices max).
Java has unsigned shorts? My largest model currently has 50k indices, which don't fit in a signed short (and that's what a ShortBuffer holds, no?).

Quote from: spasi on July 28, 2012, 09:29:22
- Try to use DrawRangeElements, like so: glDrawRangeElements(GL_TRIANGLES, 0, mesh.maxIndex, mesh.indicesCnt, GL_UNSIGNED_SHORT, 0);
No difference, still crash.

Quote from: spasi on July 28, 2012, 09:29:22
- Try to not use a VBO for the indices, like so: glDrawElements(GL_TRIANGLES, indicesBuffer);
It actually worked for 10s; I saw the big model rendered. Creating a second "instance" crashed the JVM, and I tried two more times and I'm still getting crashes. It means I just got lucky the first time it worked, eh... So, in short, no difference, still a crash.

EDIT: I keep suspecting the GC, because that's something that can be random, and it would be natural for it to get worse with larger models, but... I can't pin it down. I don't see anywhere I could have made an error that the GC would screw up. I tried increasing the Java heap size, and I played for a while without crashes... but when I tried to run the game a second time I got crashes right at the start. I get a report from the crash, and it looks something like this:
Quote
Heap
PSYoungGen      total 131200K, used 49298K [0xabcb0000, 0xb4600000, 0xb4600000)
  eden space 126656K, 35% used [0xabcb0000,0xae866ba8,0xb3860000)
  from space 4544K, 99% used [0xb4190000,0xb45fdf40,0xb4600000)
  to   space 6976K, 0% used [0xb3860000,0xb3860000,0xb3f30000)
PSOldGen        total 18112K, used 10877K [0x67200000, 0x683b0000, 0xabcb0000)
  object space 18112K, 60% used [0x67200000,0x67c9f590,0x683b0000)
PSPermGen       total 27264K, used 21385K [0x63200000, 0x64ca0000, 0x67200000)
  object space 27264K, 78% used [0x63200000,0x646e2740,0x64ca0000)
In every crash, 'from space' is 99% used, but PSYoungGen used is nowhere near the total. I have no idea what this means ::) Is this a sign of a possible deallocation issue?

Liosan

spasi

Quote from: Liosan on July 28, 2012, 09:50:54
Quote from: spasi on July 28, 2012, 09:29:22
- Try the ARB_debug_output extension (you'll need to use .withDebug(true) on ContextAttribs for it to be available)
So, how do I use that? I tried to google an example, but couldn't. Do you have a link or something? I used new ContextAttribs().withDebug(true), what now?

You need to run this:

if ( GLContext.getCapabilities().GL_ARB_debug_output )
    ARBDebugOutput.glDebugMessageCallbackARB(new ARBDebugOutputCallback());


You can specify your own handler in the ARBDebugOutputCallback constructor, but the default should be fine.

Quote from: Liosan on July 28, 2012, 09:50:54
Java has unsigned shorts? My largest model currently has 50k indices, which don't fit in a signed short (and that's what a ShortBuffer holds, no?).

OpenGL rendering is language/platform neutral; it doesn't matter that Java doesn't support unsigned shorts. That's why LWJGL uses direct buffers for passing data to the GL driver: it's just native memory and you can put anything you want in there. It's very easy:

int[] indices = ...
ByteBuffer indicesBuffer = ...
// Encode as ushort
short indexOut = (short)indices[0];
indicesBuffer.putShort(0, indexOut);
// Decode as int
int indexIn = indicesBuffer.getShort(0) & 0xFFFF;


Encoding is a simple cast, decoding (say, if you want to validate the indices written) is a binary operation.
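To be concrete, the cast survives the 32k-64k range even though Java's short is signed; a tiny roundtrip sketch (the helper method is hypothetical):

```java
import java.nio.ByteBuffer;

public class UShortRoundtrip {
    /** Encodes an int as an unsigned short in a ByteBuffer and decodes it back. */
    static int roundtrip(int index) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort(0, (short) index);  // encode: plain narrowing cast
        return buf.getShort(0) & 0xFFFF; // decode: mask off the sign extension
    }

    public static void main(String[] args) {
        System.out.println(roundtrip(50000)); // 50000, despite not fitting a signed short
    }
}
```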

Anyway, I'd be very surprised if it's a GC issue. Usually crashes like these come from bad VBO data, most likely bad indices. It would certainly help if you could come up with a simpler test that reproduces the issue, something that you could post here for others to run.

Liosan

Quote from: spasi on July 28, 2012, 15:02:17
Quote from: Liosan on July 28, 2012, 09:50:54
Quote from: spasi on July 28, 2012, 09:29:22
- Try the ARB_debug_output extension (you'll need to use .withDebug(true) on ContextAttribs for it to be available)
So, how do I use that? I tried to google an example, but couldn't. Do you have a link or something? I used new ContextAttribs().withDebug(true), what now?

You need to run this:

if ( GLContext.getCapabilities().GL_ARB_debug_output )
    ARBDebugOutput.glDebugMessageCallbackARB(new ARBDebugOutputCallback());

OK, hooked it up, no output :(

Quote from: spasi on July 28, 2012, 15:02:17
int[] indices = ...
ByteBuffer indicesBuffer = ...
// Encode as ushort
short indexOut = (short)indices[0];
indicesBuffer.putShort(0, index);
// Decode as int
int indexIn = indicesBuffer.getShort(0) & 0xFFFF;

OK, the ByteBuffer.putShort method looks promising, but the code above still requires a conversion, specifically for numbers in the 32k-64k range going into a 'short' variable.
Anyways, I tested it with a small model, and using the code above doesn't help :(

Quote from: spasi on July 28, 2012, 15:02:17
Usually crashes like these come from bad VBO data, most likely bad indices.
Well, that sounds reasonable... the model comes from a custom exporter, so there could well be errors there. Any idea how I can debug that, and what kind of errors I should be looking for? It's my first time using indexed VBOs :)

Quote from: spasi on July 28, 2012, 15:02:17
It would certainly help if you could come up with a simpler test that reproduces the issue, something that you could post here for others to run.
We'll try :-)


Liosan

spasi

Quote from: Liosan on July 28, 2012, 18:46:47
OK, the ByteBuffer.putShort method looks promising, but the code above still requires a conversion, specifically for numbers in the 32k-64k range going into a 'short' variable.
Anyways, I tested it with a small model, and using the code above doesn't help :(

You don't need to do anything special for the 32k-64k range. Assuming you're certain there won't be a 64k+ index in the data, at load time you'd do:

for (int i = 0; i < mesh.indicesCnt; i++)
    indicesBuffer.put(scanner.nextShort());


Then it's just a matter of the scanner returning the proper data.

Quote from: Liosan on July 28, 2012, 18:46:47
Well, that sounds reasonable... the model comes from a custom exporter, so there could well be errors there. Any idea how I can debug that, and what kind of errors I should be looking for? It's my first time using indexed VBOs :)

Hmm, a wrong byte order would cause trouble. E.g. writing the mesh in little-endian and reading in big-endian would reverse the byte order and mess up the indices.
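For example (sketch): the same four bytes written little-endian and read big-endian turn a tiny index into a huge one, far past any vertex count:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    /** Writes an index little-endian and reads it back big-endian. */
    static int misread(int index) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, index);                          // bytes: 07 00 00 00 for index 7
        return buf.order(ByteOrder.BIG_ENDIAN).getInt(0); // same bytes, wrong order
    }

    public static void main(String[] args) {
        System.out.println(misread(7)); // 117440512, i.e. 7 << 24
    }
}
```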

If you want to be certain it's not an IO or GC issue, you can read-back the VBO data after initialization and check if everything is as you'd expect.
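For the index part specifically, even a trivial bounds check at load time would catch the most common exporter bug (sketch; the helper name is made up, plain Java, no GL involved):

```java
import java.nio.IntBuffer;

public class IndexCheck {
    /** Returns the position of the first out-of-range index, or -1 if all indices
        are within [0, verticesCnt). An out-of-range index makes glDrawElements
        read past the end of the vertex VBO. */
    static int firstBadIndex(IntBuffer indices, int verticesCnt) {
        for (int i = 0; i < indices.limit(); i++) {
            int idx = indices.get(i);
            if (idx < 0 || idx >= verticesCnt)
                return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        IntBuffer ok  = IntBuffer.wrap(new int[] { 0, 1, 2, 2, 1, 3 });
        IntBuffer bad = IntBuffer.wrap(new int[] { 0, 1, 99 });
        System.out.println(firstBadIndex(ok, 4));  // -1
        System.out.println(firstBadIndex(bad, 4)); // 2
    }
}
```

You'd run it on indicesBuffer right after filling it, before the glBufferData call.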

Another random thought: Have you checked if there's another vertex attrib array enabled (other than 0, 1, 2)?

Liosan

Quote from: spasi on July 28, 2012, 19:06:34
for (int i = 0; i < mesh.indicesCnt; i++)
    indicesBuffer.put(scanner.nextShort());


Then it's just a matter of the scanner returning the proper data.
The scanner throws :(
Quote
java.util.InputMismatchException: Value out of range. Value:"32768" Radix:10
   at java.util.Scanner.nextShort(Unknown Source)
This is of course an academic discussion; if I needed to put shorts there, I'd find a way, even if I had to write them byte by byte. But, as I checked with a small model, it doesn't help :(

Quote from: spasi on July 28, 2012, 19:06:34
Hmm, a wrong byte order would cause trouble. E.g. writing the mesh in little-endian and reading in big-endian would reverse the byte order and mess up the indices.
Hm, but that would fail consistently, not randomly, right?

Quote from: spasi on July 28, 2012, 19:06:34
If you want to be certain it's not an IO or GC issue, you can read-back the VBO data after initialization and check if everything is as you'd expect.
Good idea, I'll try that!

Quote from: spasi on July 28, 2012, 19:06:34
Another random thought: Have you checked if there's another vertex attrib array enabled (other than 0, 1, 2)?
Yes, there is. The terrain renderer doesn't clean up after itself, and it's got like 6 different vertex attrib arrays. Could it cause such behaviour? I'll check tomorrow :)

Liosan

spasi

Quote from: Liosan on July 28, 2012, 19:24:14
Quote from: spasi on July 28, 2012, 19:06:34
Another random thought: Have you checked if there's another vertex attrib array enabled (other than 0, 1, 2)?
Yes, there is. The terrain renderer doesn't clean up after itself, and it's got like 6 different vertex attrib arrays. Could it cause such behaviour? I'll check tomorrow :)

This could be the problem, yes. I can't say for sure without testing, but I believe that OpenGL will source all enabled vertex arrays, irrespective of what the vertex shader is using. This would of course lead to illegal memory accesses on the unrelated vertex arrays. It would also explain the randomness: I'm guessing the terrain vertex arrays change in size depending on the camera location/orientation. When they become smaller than the mesh vertex arrays, you get a crash.
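If you want to verify this before changing the terrain renderer, you could dump the enabled arrays right before the draw call. This is an untested sketch against the LWJGL 2 API (the class and method names are mine), and it needs a live GL context, so I can't show output:

```java
import java.nio.IntBuffer;
import org.lwjgl.BufferUtils;
import org.lwjgl.opengl.GL11;
import org.lwjgl.opengl.GL20;

public class AttribAudit {
    /** Prints every currently enabled vertex attrib array. Debugging sketch only:
        call it with a current GL context, e.g. right before glDrawElements. */
    static void dumpEnabledAttribs() {
        int max = GL11.glGetInteger(GL20.GL_MAX_VERTEX_ATTRIBS);
        IntBuffer param = BufferUtils.createIntBuffer(16); // LWJGL wants headroom
        for (int i = 0; i < max; i++) {
            GL20.glGetVertexAttrib(i, GL20.GL_VERTEX_ATTRIB_ARRAY_ENABLED, param);
            if (param.get(0) != 0)
                System.out.println("vertex attrib array " + i + " is enabled");
        }
    }
}
```

If anything beyond 0, 1, 2 shows up before your mesh draw, the terrain renderer's leftovers are the culprit.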

broumbroum

hello Liosan,
I currently use vertex arrays for drawing simple shapes in my game. It features rendering of a lot of small font characters and also 2D shapes like HUD objects.

I looked at your render code, and following the guidelines of songho's tutorials (http://www.songho.ca/opengl/gl_vertexarray.html), you don't use the same pointers, i.e. the gl*AttribPointer methods. I render everything by enabling the vertex array client state bit and pointing to the buffers with glEnableClientState() and glVertexPointer() respectively. It seems like you are mixing methods. Also, this is part of VAO rendering, which is a kind of "soft" rendering, because vertex arrays live in client RAM and need a CPU cycle to transfer themselves to the GPU frame renderer.

VBOs are bound with the ARB specs, whose functions are named *ARB, e.g. glGenBuffersARB(). VertexBufferObjects are known to be faster: they live in VRAM (GPU memory), and because of the asynchronous transfer of data between OpenGL-accelerated VRAM and the render framebuffer, the CPU can "process some other task" meanwhile. VBOs may enhance system stability thanks to this particularity.

See the diagrams of PBO data processing, really thoughtful. (OpenGL can swap between VBOs as it does with PBOs)
source: WWW.SONGHO.CA : about PBO

^^

Liosan

Quote from: spasi on July 28, 2012, 21:12:37
This could be the problem, yes.
Bingo! A few glDisableVertexAttribArray calls and everything works fine. Thanks a lot :) we were completely stuck.

@broumbroum: I think you are mistaken; we are using VBOs in the 3.0+ way (so we don't need the ARB extension), and VAOs are used to group VBO binding parameters into an easier-to-manage structure. At least, that's what my OpenGL SuperBible (5th edition) seems to indicate. I also think my VBOs are stored on the GPU, thanks to our glBufferData calls.

Liosan