glDrawElements and low FPS

Started by newb, May 22, 2012, 16:56:21

Previous topic - Next topic

newb

Hi everyone,

I'm new to OpenGL and of course LWJGL. I have a question about calling the method glDrawElements:
I want to render ~200 objects (150 of them look the same), and for this I use VBOs.
Each object has its own render method, where the position, rotation and scale are bound as a uniform matrix and the call to glDrawElements takes place.

The problem is that I get a low frame rate (~50 FPS), even though my objects (for now) consist of simple quads.

Later I want to use much more complex objects and other computations...

I thought glDrawElements only triggers the rendering on the GPU, so why does it take so much time?
Did I take a wrong approach?

Thanks for the help!

matheus23

Maybe you made a mistake and are uploading the VBO data every frame... Show us some rendering code; that would help a lot ;)
My github account and currently active project: https://github.com/matheus23/UniverseEngine

newb

The render function for each of the 200 objects looks like this:
public void render() {
   GL11.glDrawElements( OpenGL.GL_TRIANGLE_STRIP, 16, OpenGL.GL_UNSIGNED_INT, 0 );
}

I created 200 vertex array objects, but I bound only the last one (from the last constructor call).
The constructor of each object (where f is a float array and i is an integer array):
...
vaID = OpenGL.createVertexArray( f, i );      
OpenGL.glBindVertexArray( vaID );
...
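A note on that snippet: a VAO only affects draw calls while it is bound, so if only the VAO from the last constructor call stays bound, all 200 draw calls reuse that same vertex array. Presumably each object's render method should bind its own VAO before drawing; a sketch, reusing the vaID field and wrapper names from the snippets above:

```
public void render() {
    OpenGL.glBindVertexArray( vaID );  // select this object's vertex array first
    GL11.glDrawElements( OpenGL.GL_TRIANGLE_STRIP, 16, OpenGL.GL_UNSIGNED_INT, 0 );
}
```

(This adds one extra state change per object, but without it every object is drawn with the last VAO's buffers.)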



The render loop in my OpenGL class (a wrapper class) looks like this:

   public static void render() {
      long time = System.currentTimeMillis();
      int frames  = 0;
      
      
      while( !Display.isCloseRequested() ) {
         
         if( Display.wasResized() ) {
            OpenGL.resize();
         }
         OpenGL.getMouseInputs();
         OpenGL.glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );
         
         
         for ( Renderable element : renderList ) {
            element.render();
         }
         Display.update();
         
         // Calc FPS
         frames++;
         if( ( System.currentTimeMillis() - time ) >= 1000 ) {
            Display.setTitle( OpenGL.displayTitle + " FPS: " + (int)((frames * 1000)/( System.currentTimeMillis() - time )) );
            time = System.currentTimeMillis();
            frames = 0;
         }
      }
   }
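As a side note, FPS bookkeeping like the frames/time pair in that loop is easy to get subtly wrong (for example, when time starts at 0 the first title update reports a bogus value). Pulling the counting logic into a small plain-Java helper makes it testable in isolation; a sketch (the FpsCounter class and its method names are made up for illustration, not part of LWJGL):

```java
/** Counts rendered frames and reports frames-per-second once per interval. */
class FpsCounter {
    private long windowStart;       // start of the current measuring window (ms)
    private int frames;             // frames rendered in the current window
    private final long intervalMs;  // reporting interval, e.g. 1000 ms

    FpsCounter(long nowMs, long intervalMs) {
        this.windowStart = nowMs;
        this.intervalMs = intervalMs;
    }

    /** Call once per frame; returns the FPS when a window ends, else -1. */
    int frameRendered(long nowMs) {
        frames++;
        long elapsed = nowMs - windowStart;
        if (elapsed >= intervalMs) {
            int fps = (int) ((frames * 1000L) / elapsed);
            windowStart = nowMs;
            frames = 0;
            return fps;
        }
        return -1;
    }
}
```

In the loop above it would replace the frames/time pair: int fps = counter.frameRendered( System.currentTimeMillis() ); if ( fps >= 0 ) Display.setTitle( ... );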


What's the problem?

Thanks for the help!

newb

Hi all,

does no one have a clue? Modern game engines render far more than 1000 objects. (I know they use C++ instead.)
Please help me if you can.

Thanks!

Fool Running

Well, my first question is: What graphics card do you have? This can make a HUGE difference.
Intel integrated cards = junk.
ATI/Nvidia cards = good.

I assume you aren't doing anything with shaders or any other complex processing (like motion blur or something) that might slow it down.
Programmers will, one day, rule the world... and the world won't notice until it's too late. Just testing the marquee option ;D

newb

Hi,
I have a laptop with an Nvidia 9600M. I can play Team Fortress 2 just fine!
When I render only 1 VBO, my frame rate is about 1900 FPS.
But when I call glDrawElements 200 times, it drops to about 50 FPS.

My vertex shader:
#version 330 core

const vec3		inverseLightDir = vec3 ( 0.0, 0.0, 1.0 );

in		vec3	vertices;
in		vec3	normals;
in		vec3	colors;
out		float	brightness;
out		vec3	colors_fs;
uniform	mat4	mat;
//uniform mat4	projectionMat;

void main()
{
	gl_Position = mat4( 0.5, 0.0, 0.0, 0.0,
	                    0.0, 0.5, 0.0, 0.0,
	                    0.0, 0.0, 0.5, 0.0,
	                    0.0, 0.0, 0.0, 1.0 ) * mat * vec4( vertices, 1.0 );
	vec3 normal = normalize( mat3(mat) * normals );
	brightness	= max( dot( inverseLightDir, normal ), 0 );
	colors_fs   = colors;
}


My fragment shader:
#version 330 core

in vec3 colors_fs;
in float	brightness;
out vec4 pixelColor;

void main( void )
{
	pixelColor = brightness * vec4( colors_fs, 1.0f );
}


The method for creating a VAO looks like this:
public static int createVertexArray ( float[] vertices, int[] indices ) {
		int vertexArrayID = OpenGL.glGenVertexArrays();
		OpenGL.glBindVertexArray( vertexArrayID );
		
		FloatBuffer fb = BufferUtils.createFloatBuffer( vertices.length );
		fb.put( vertices );
		fb.position( 0 );
		int vertexBufferID = OpenGL.glGenBuffers();
		OpenGL.glBindBuffer( OpenGL.GL_ARRAY_BUFFER, vertexBufferID );
		OpenGL.glBufferData( OpenGL.GL_ARRAY_BUFFER, fb, OpenGL.GL_STATIC_DRAW );
		
		IntBuffer ib = BufferUtils.createIntBuffer( indices.length );
		ib.put( indices );
		ib.position( 0 );
		int indexBufferID = OpenGL.glGenBuffers();
		OpenGL.glBindBuffer( OpenGL.GL_ELEMENT_ARRAY_BUFFER, indexBufferID );
		OpenGL.glBufferData( OpenGL.GL_ELEMENT_ARRAY_BUFFER, ib, OpenGL.GL_STATIC_DRAW );
		
		// vertices
		OpenGL.glEnableVertexAttribArray( OpenGL.VS_ATTRIB_VERTICES );
		OpenGL.glVertexAttribPointer( OpenGL.VS_ATTRIB_VERTICES, 3, OpenGL.GL_FLOAT, false, 3*3*4, 0 );
		// normals
		OpenGL.glEnableVertexAttribArray( OpenGL.VS_ATTRIB_NORMALS );
		OpenGL.glVertexAttribPointer( OpenGL.VS_ATTRIB_NORMALS, 3, OpenGL.GL_FLOAT, false, 3*3*4, 1*3*4 );
		// colors
		OpenGL.glEnableVertexAttribArray( OpenGL.VS_ATTRIB_COLORS );
		OpenGL.glVertexAttribPointer( OpenGL.VS_ATTRIB_COLORS, 3, OpenGL.GL_FLOAT, false, 3*3*4, 2*3*4 );

		return vertexArrayID;
	}

Fool Running

I would try using the fixed-function pipeline (getting rid of the shaders), or simplifying your shaders as much as possible (i.e. make the vertex shader just set the position and the fragment shader just set the color to some constant value). I don't know a lot about shaders, but they look needlessly complex to me, which might be slowing it down. Especially the creation of the matrix in the vertex shader.

newb

Hi,

yes, I know. My shaders look a bit dirty because I'm testing a lot.
But I changed everything: the vertex shader now only calculates the position. Not a big deal.
Still, I didn't get more FPS. I think the problem is the function itself, or calling the function via LWJGL.

I don't get why calling glDrawElements ~400 times each frame is such a big deal.
I thought that this should be very fast.
Copying the data into the buffers should take much more time... but that's another thing.


Does someone else have a clue?


I'm sorry for my bad English...

Timo

newb

P.S. I figured out that I get an FPS boost (about 2x) when I enable GL_CULL_FACE.
I know what culling is and why it is faster.
But why exactly do I get the FPS boost? Is it because glDrawElements halts until the rendering has completed?

Thx

matheus23

I don't really get why it's not self-explanatory... You said you know why face culling makes things go faster?

I don't know... maybe you understood face culling wrong. This is how it works:

Face culling simply skips rendering faces that do not face the eye point. By convention, faces whose vertices come out counter-clockwise after transformation are treated as facing the eye, and faces that come out clockwise (cw) are not rendered. That test is simple and very fast, and the performance boost comes from far fewer vertex shader and fragment shader invocations. The fill rate is also much lower, since you only render about half the faces.
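In LWJGL this is just a couple of state calls; a sketch (GL_BACK and GL_CCW are in fact the GL defaults, made explicit here):

```
GL11.glEnable( GL11.GL_CULL_FACE );   // turn face culling on
GL11.glCullFace( GL11.GL_BACK );      // discard back-facing triangles
GL11.glFrontFace( GL11.GL_CCW );      // counter-clockwise winding = front-facing
```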

That's it. It's probably not the best explanation, but it should either show you where you misunderstood face culling, or clear things up... I think...

One more thing: I don't get what you mean by "the glDrawElements function halts"... No, it doesn't? It behaves as usual.

newb

Hi matheus23,

thanks for the information about face culling. But I understood it quite well.
I meant:

glDrawElements(...);
// Are the elements completely rendered at this point? Or is the graphics card still rendering while the program proceeds with other commands?
<someOtherFunctionCall>(...);


Thanks!

matheus23

Quote from: newb on May 28, 2012, 08:37:43
Hi matheus23,

thanks for the information about face culling. But I understood it quite well.
I meant:

glDrawElements(...);
// Are the elements completely rendered at this point? Or is the graphics card still rendering while the program proceeds with other commands?
<someOtherFunctionCall>(...);


Thanks!

:D ahh okay, blame me :)
Yep, it is completely rendered ;)

Fool Running

Quote from: matheus23 on May 29, 2012, 08:49:33
Yep, it is completely rendered ;)
Actually, that's not entirely correct. When you send the data to OpenGL, the drivers queue up the drawing commands and the GPU works on them as fast as it can. The call to Display.update() is what waits until the GPU is finished rendering everything before continuing. That's why if you profile most LWJGL applications, most of the time seems to be spent in Display.update() - because it's waiting for the GPU to finish its work.
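One way to see this queueing in practice (just a diagnostic sketch, not something to leave in production code) is to time the draw calls against a forced synchronization with glFinish, which blocks until the GPU has finished all queued work:

```
long before = System.nanoTime();
for ( Renderable element : renderList ) {
    element.render();                  // only queues commands; returns quickly
}
long queued = System.nanoTime();
GL11.glFinish();                       // blocks until the GPU has actually finished
long done = System.nanoTime();
System.out.println( "queueing took " + ( queued - before ) / 1000000 + " ms, "
                  + "waiting for the GPU took " + ( done - queued ) / 1000000 + " ms" );
```

If most of the time shows up in the glFinish wait, the bottleneck is on the GPU/driver side rather than in the Java loop itself.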

matheus23

Quote from: Fool Running on May 29, 2012, 12:54:53
Quote from: matheus23 on May 29, 2012, 08:49:33
Yep, it is completely rendered ;)
Actually, that's not entirely correct. When you send the data to OpenGL, the drivers queue up the drawing commands and the GPU works on them as fast as it can. The call to Display.update() is what waits until the GPU is finished rendering everything before continuing. That's why if you profile most LWJGL applications, most of the time seems to be spent in Display.update() - because it's waiting for the GPU to finish its work.

Blame me again. Sorry for giving wrong information. I thought about it that way, and I had bad information sources too...

newb

Quote from: Fool Running on May 29, 2012, 12:54:53
Quote from: matheus23 on May 29, 2012, 08:49:33
Yep, it is completely rendered ;)
Actually, that's not entirely correct. When you send the data to OpenGL, the drivers queue up the drawing commands and the GPU works on them as fast as it can. The call to Display.update() is what waits until the GPU is finished rendering everything before continuing. That's why if you profile most LWJGL applications, most of the time seems to be spent in Display.update() - because it's waiting for the GPU to finish its work.

Thanks, Fool Running!
But isn't it a bit weird that my program slows down that much? I mean, I "only" render a few hundred VAOs (= function calls).