Speed problem with VBOs

Started by Fool Running, December 09, 2004, 14:49:26


Fool Running

I know this is similar to another post on the forum, but my case is slightly different....

I changed my game engine to use VBOs instead of display lists.  I've gathered (from this forum and in other places) that VBOs should be just as fast, if not faster,  than using display lists.
I have 2 VBOs for my model.  One VBO contains all the info for the model interleaved (x,y,z,cr,cg,cb,ca,nx,ny,nz,tx,ty) and the other is an index list so I can use glDrawRangeElements().  I've read that this is the fastest way to set up the VBOs, but I am still showing a decrease of about 10 percent in the speed over display lists.  :?
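
For reference, here is a minimal sketch of the byte layout that this interleaved format implies, assuming every component is a 4-byte float (as stated later in the thread). The class and method names are hypothetical and only illustrate the arithmetic; they are not part of LWJGL or of the original engine.

```java
// Hypothetical helper: computes stride and offsets for the interleaved
// layout x,y,z, cr,cg,cb,ca, nx,ny,nz, tx,ty described in the post.
final class InterleavedLayout {
    static final int BYTES_PER_FLOAT = 4;

    // Component counts, in the order given in the post.
    static final int POSITION = 3;
    static final int COLOR = 4;
    static final int NORMAL = 3;
    static final int TEXCOORD = 2;

    static int floatsPerVertex() {
        return POSITION + COLOR + NORMAL + TEXCOORD;
    }

    // Distance in bytes from one vertex to the next.
    static int strideBytes() {
        return floatsPerVertex() * BYTES_PER_FLOAT;
    }

    // Byte offsets of each attribute inside one vertex.
    static int positionOffset() { return 0; }
    static int colorOffset()    { return POSITION * BYTES_PER_FLOAT; }
    static int normalOffset()   { return (POSITION + COLOR) * BYTES_PER_FLOAT; }
    static int texCoordOffset() { return (POSITION + COLOR + NORMAL) * BYTES_PER_FLOAT; }
}
```

These are the stride and offset values that would be handed to the vertex, color, normal and texcoord pointer calls while the VBO is bound.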

Is there something I'm setting up wrong (I'll post my code when I get back home  :wink: ) or are VBOs not faster or just not faster with LWJGL?

I have a GeForce FX 5600 with the latest drivers, WinXP Pro, P4 1.8, LWJGL 0.92
Programmers will, one day, rule the world... and the world won't notice until it's too late. Just testing the marquee option ;D

princec

Actually display lists should really be the fastest way of drawing in OpenGL. They are stored serverside, actually on the card - whereas a VBO is typically going to be in AGP RAM, and that incurs a small penalty as it's got to stream the data to the card every frame.

The penalty is generally worth it though because you can change bits of the data dynamically every frame at little cost in a VBO. But a display list has to be destroyed and recreated in its entirety incurring far more overhead.

Blimey, and I haven't even upgraded the SPGL sprite engine to use VBOs yet! Maybe OrangyTang will do it for me :)

Cas :)

Fool Running

I was under the impression that VBOs were stored on the card:
http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=45

in buildVBOs():
Quote: Our Copy Of The Data Is No Longer Necessary, It Is Safe In The Graphics Card

So this is not the case  :cry:  Oh, well, back to display lists  :lol:

Thanks for the fast reply  8)  LWJGL is awesome

princec

The precise location of a VBO depends on the usage you ask of it. As I've not actually used 'em before I can't remember what the flags are you pass in when you create them but basically:

- If you ever need to read data out of a VBO it's going to end up in system RAM

- If you frequently write data to a VBO it'll be in AGP RAM.

- If you infrequently write data to a VBO it'll be in card RAM.
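
As a rough sketch of how those three cases line up with the usage hints passed at buffer creation: the token values below are the ones defined by the ARB_vertex_buffer_object extension, but the `choose` helper and its two flags are hypothetical, and the mapping is only an approximation. The hints are just hints; where the data actually ends up is the driver's decision.

```java
// Hypothetical helper mapping an intended access pattern to a usage hint.
final class UsageHints {
    // Token values from the ARB_vertex_buffer_object extension.
    static final int GL_STATIC_DRAW_ARB  = 0x88E4; // written rarely, drawn often
    static final int GL_STATIC_READ_ARB  = 0x88E5; // filled rarely, read back by the app
    static final int GL_DYNAMIC_DRAW_ARB = 0x88E8; // rewritten often, drawn often
    static final int GL_DYNAMIC_READ_ARB = 0x88E9; // refilled often, read back by the app

    // readBack:       the application needs to read the contents back
    // rewrittenOften: the contents change frequently
    static int choose(boolean readBack, boolean rewrittenOften) {
        if (readBack) {
            return rewrittenOften ? GL_DYNAMIC_READ_ARB : GL_STATIC_READ_ARB;
        }
        return rewrittenOften ? GL_DYNAMIC_DRAW_ARB : GL_STATIC_DRAW_ARB;
    }
}
```

For a model that is uploaded once and only drawn, `UsageHints.choose(false, false)` returns `GL_STATIC_DRAW_ARB`.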

Cas :)

Fool Running

I'm using GL_STATIC_DRAW_ARB which I think is the last choice... hmmmmmm :?  Should it be faster then?

spasi

Quote from: "Fool Running"
Quote: Our Copy Of The Data Is No Longer Necessary, It Is Safe In The Graphics Card

So this is not the case  :cry:  Oh, well, back to display lists  :lol:

Actually, you don't have to care where it will be stored. It is entirely possible for it to be stored on the graphics card, just like textures. The driver is responsible for managing that.

Also, don't forget that when comparing VBO & DL speed, you should take into account different implementations, often giving you different results.

And I believe that VBOs are more "memory efficient" than DLs (I don't remember any details though).

spasi

Quote from: "Fool Running"
I'm using GL_STATIC_DRAW_ARB which I think is the last choice... hmmmmmm :?  Should it be faster then?

STATIC_DRAW is the best option for static data. But there are other things that may affect speed, like the vertex format used (efficient data types, supported "data path", total size in bytes), number of indices, bugs in driver, etc.
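
To make the "total size in bytes" point concrete, here is a small sketch comparing the per-vertex size of the layout from the first post when the four color components are stored as floats versus as single bytes. The helper name and parameter are hypothetical, and whether the smaller format is actually faster depends on the card and driver, so treat it as arithmetic only.

```java
// Hypothetical helper: bytes per vertex for position(3f) + color(4) +
// normal(3f) + texcoord(2f), with a configurable color component size.
final class VertexSize {
    static int bytesPerVertex(int colorBytesPerComponent) {
        int position = 3 * 4;                      // three floats
        int color    = 4 * colorBytesPerComponent; // r, g, b, a
        int normal   = 3 * 4;                      // three floats
        int texCoord = 2 * 4;                      // two floats
        return position + color + normal + texCoord;
    }

    static int totalBytes(int vertexCount, int colorBytesPerComponent) {
        return vertexCount * bytesPerVertex(colorBytesPerComponent);
    }
}
```

With float colors each vertex takes 48 bytes; with one byte per color component it takes 36.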

One last thing: How large are your meshes? Try comparing larger.

Orangy Tang

Quote from: "princec"
Blimey, and I haven't even upgraded the SPGL sprite engine to use VBOs yet! Maybe OrangyTang will do it for me :)

Cas :)
Last time I checked (a few months ago) VBOs were still somewhat dodgy in both current nVidia and ATi cards (all sorts of odd workarounds for one driver or another) and getting the same behaviour/performance out of both was pretty damn difficult.

I'd hope that it's improved by now, but it's still something of a black art from what I've seen. :(

Fool Running

Quote: But there are other things that may affect speed, like the vertex format used (efficient data types, supported "data path", total size in bytes), number of indices, bugs in driver, etc.
By vertex format do you mean the stored format?  I'm using floats. Pseudocode:
public void createVBO() {
    // Create the interleaved float array and store it in a FloatBuffer
    id = glGenBuffer();
    // Bind with the buffer target; the usage hint belongs to glBufferData
    glBindBuffer(GL_ARRAY_BUFFER, id);
    glBufferData(GL_ARRAY_BUFFER, floatBuffer, GL_STATIC_DRAW);
    // Create a ShortBuffer of indices
    id2 = glGenBuffer();
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, id2);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, shortBuffer, GL_STATIC_DRAW);
}

public void render() {
    glBindBuffer(GL_ARRAY_BUFFER, id);
    // set up pointers into the VBO
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, id2);
    glDrawRangeElements();
}

Sorry, but I don't have the exact code here so I may have mistyped some of it  :roll:
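
The two "store in a buffer" steps in the pseudocode could look roughly like the sketch below, which uses only java.nio and makes no GL calls. The class name, method names and the separate per-attribute input arrays are assumptions for illustration; the original engine's data structures are not shown in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.ShortBuffer;

// Hypothetical helper: packs per-attribute arrays into one interleaved
// direct FloatBuffer (x,y,z, r,g,b,a, nx,ny,nz, tx,ty) plus a ShortBuffer
// of indices, ready to be handed to the buffer upload calls.
final class MeshBuffers {
    static final int FLOATS_PER_VERTEX = 12;

    static FloatBuffer interleave(float[] pos, float[] col, float[] nrm, float[] tex) {
        int vertexCount = pos.length / 3;
        FloatBuffer out = ByteBuffer
                .allocateDirect(vertexCount * FLOATS_PER_VERTEX * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int v = 0; v < vertexCount; v++) {
            out.put(pos, v * 3, 3); // position
            out.put(col, v * 4, 4); // color
            out.put(nrm, v * 3, 3); // normal
            out.put(tex, v * 2, 2); // texture coordinates
        }
        out.flip(); // make the written data readable from position 0
        return out;
    }

    static ShortBuffer indices(short[] idx) {
        ShortBuffer out = ByteBuffer
                .allocateDirect(idx.length * 2)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
        out.put(idx);
        out.flip();
        return out;
    }
}
```

Direct buffers in native byte order are used here because that is what native GL bindings generally expect.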

Quote: One last thing: How large are your meshes? Try comparing larger.
I'm using small meshes (1300 polys)

Thanks for all your help guys. I'll try playing around with it more and see if I can make it better. :lol:

spasi

Quote from: "Fool Running"
By vertex format do you mean the stored format? I'm using floats.

I mean the attributes present for each vertex (positions, normals, texcoords, etc) and their data types.

One example I had a problem with was bone indices. I assumed an UNSIGNED_BYTE format would be ideal (we have much less than 256 bones :wink:), but for some reason that simple data format is not properly supported for generic vertex attributes. The result, of course, was 1-5fps... I had to use SHORT (not even unsigned!) for it to work fast.

This happened on an NV, don't know about ATIs (both work great with SHORT now). Although, it may have been a driver issue that is fixed now. Hmm, I should test this...
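
A minimal sketch of that workaround on the data side, assuming the bone indices start out as unsigned bytes in a `byte[]`: widen each one to a `short` before uploading. The class and method names are hypothetical; the original conversion code is not shown in the thread.

```java
// Hypothetical helper: widens unsigned-byte bone indices (0..255) to
// shorts so they can be submitted as a SHORT vertex attribute.
final class BoneIndexWidening {
    static short[] widen(byte[] boneIndices) {
        short[] out = new short[boneIndices.length];
        for (int i = 0; i < boneIndices.length; i++) {
            // Mask with 0xFF so values above 127 are not sign-extended.
            out[i] = (short) (boneIndices[i] & 0xFF);
        }
        return out;
    }
}
```

This doubles the storage for the attribute, which is the price of staying on the format the driver handled quickly.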