Why is glClear() so slow with point sprites on iPhone?
I am trying to draw point sprites with OpenGL ES on iPhone. There could be many of them (around 1000), each up to 64 pixels wide (maybe that's my problem right there: is there a limit, or could I be using too much memory?).
I am using CADisplayLink to time the frames. What happens is that the first GL drawing function tends to delay or stall when either the point count is too high or the point size is too big. In my example below, glClear() is the first drawing function, and it can take anywhere from 0.02 seconds to 0.2 seconds to run. If I simply comment out glClear, glDrawArrays becomes the slow function (it runs very fast otherwise).
This example is what I've stripped my code down to in order to isolate the problem. It simply draws a bunch of point sprites, with no texture, all in the same spot. I am using VBOs to store all the sprite data (position, color, size). It may seem like overkill for the example but of course I have intentions to modify this data later.
This is the view's init function (minus the boilerplate gl setup):
glDisable(GL_DEPTH_TEST);
glDepthMask(GL_FALSE);
glDisable(GL_LIGHTING);
glDisable(GL_FOG);
glEnable(GL_TEXTURE_2D);
glEnable(GL_BLEND);
glBlendEquationOES(GL_FUNC_ADD_OES);
glClearColor(0.0, 0.0, 0.0, 0.0);
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
glTexEnvi(GL_POINT_SPRITE_OES, GL_COORD_REPLACE_OES, GL_TRUE);
glEnable(GL_POINT_SPRITE_OES);
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_POINT_SIZE_ARRAY_OES);
glEnableClientState(GL_COLOR_ARRAY);
glBlendFunc(GL_SRC_ALPHA, GL_ONE);
glEnable(GL_POINT_SMOOTH);
glGenBuffers(1, &vbo); // vbo is an instance variable
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glMatrixMode(GL_PROJECTION);
glOrthof(0.0, [self frame].size.width, 0.0, [self frame].size.height, 1.0f, -1.0f);
glViewport(0, 0, [self frame].size.width, [self frame].size.height);
glMatrixMode(GL_MODELVIEW);
glTranslatef(0.0f, [self frame].size.height, 0.0f);
glScalef(1.0f, -1.0f, 1.0f);
And this is the rendering function:
- (void)render
{
    glClear(GL_COLOR_BUFFER_BIT); // This function runs slowly!
    int pointCount = 1000;
    // fyi...
    // typedef struct {
    //     CGPoint point;
    //     CFTimeInterval time;
    //     GLubyte r, g, b, a;
    //     GLfloat size;
    // } MyPoint;
    glBufferData(GL_ARRAY_BUFFER, sizeof(MyPoint)*pointCount, NULL, GL_DYNAMIC_DRAW);
    MyPoint * vboBuffer = (MyPoint *)glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
    for (int i = 0; i < pointCount; i++) {
        vboBuffer[i].a = (GLubyte)0xFF;
        vboBuffer[i].r = (GLubyte)0xFF;
        vboBuffer[i].g = (GLubyte)0xFF;
        vboBuffer[i].b = (GLubyte)0xFF;
        vboBuffer[i].size = 64.0;
        vboBuffer[i].point = CGPointMake(200.0, 200.0);
    }
    glUnmapBufferOES(GL_ARRAY_BUFFER);
    glPointSizePointerOES(GL_FLOAT, sizeof(MyPoint), (void *)offsetof(MyPoint, size));
    glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(MyPoint), (void *)offsetof(MyPoint, r));
    glVertexPointer(2, GL_FLOAT, sizeof(MyPoint), (void *)offsetof(MyPoint, point));
    glDrawArrays(GL_POINTS, 0, pointCount);
    [context presentRenderbuffer:GL_RENDERBUFFER_OES];
}
Why is the glClear function stalling? It doesn't delay by arbitrary amounts: depending on the point count or size, the delay tends to fall on the same steps (e.g. 0.015 sec, 0.030 sec, 0.045 sec, etc.). Something else strange I noticed is that if I switch to glBlendFunc(GL_ZERO, GL_ONE), it runs just fine (although this will not give the visual effect I'm after). Other glBlendFunc values change the speed as well, usually for the better. That makes me think it is not a memory issue, because blending has nothing to do with the VBO (right?).
I admit I am a bit new at OpenGL and may be misunderstanding basic concepts about VBOs or other things. Any help or guidance is greatly appreciated!
If glClear() is slow, you might try drawing a large blank quad that completely covers the viewport area.
Are you using VSync (or is it enabled)? The delay you're seeing might be related to the fact that the CPU and GPU run in parallel, so timing individual GL calls is not meaningful.
If you're using VSync (or the GPU is heavily loaded), there might be some latency in the SwapBuffers call, since some drivers make busy loops to wait for VBlank.
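The stall lengths reported in the question (roughly 0.015 s, 0.030 s, 0.045 s) look like whole multiples of one refresh interval, which is what you would expect if presentation waits for VBlank. Here is a minimal sketch of that arithmetic, assuming a fixed 60 Hz display; `quantized_frame_time` is a hypothetical helper for illustration, not part of OpenGL ES or any Apple API:

```c
/* Hypothetical helper (not a real API): if a frame needs `work_seconds` of
 * rendering time and the buffer swap waits for the next VBlank, the frame
 * appears to take a whole number of refresh intervals.
 * Assumes a fixed refresh rate given in Hz. */
double quantized_frame_time(double work_seconds, double refresh_hz)
{
    double interval = 1.0 / refresh_hz;

    /* Number of whole intervals already used up by the work... */
    int intervals = (int)(work_seconds / interval);

    /* ...rounded up so the frame ends on a VBlank boundary. */
    if ((double)intervals * interval < work_seconds) {
        intervals++;
    }
    if (intervals < 1) {
        intervals = 1;  /* even a trivial frame waits for one VBlank */
    }
    return (double)intervals * interval;
}
```

Under these assumptions, a frame that needs 0.020 s of work would show up as two refresh intervals, and one that needs 0.040 s as three, so small changes in load can make the apparent delay jump by a whole interval.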
But first consider that you should NOT time individual GL calls: most GL calls just set some GPU state or write to a command buffer, and the commands themselves execute asynchronously.
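One way to get a more meaningful number is to time the whole frame and force the GPU to finish before stopping the clock. The sketch below assumes a C11 standard library (`timespec_get`); `time_whole_frame` and `example_frame` are hypothetical names, and the callback stands in for the app's own rendering code:

```c
#include <time.h>

/* Hypothetical callback type: one complete frame of rendering. */
typedef void (*frame_fn)(void);

/* Wall-clock time in seconds (C11 timespec_get; assumed to be available). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Times one whole frame instead of individual GL calls.
 * For the result to include GPU work, `frame` should issue all of its
 * GL calls and then call glFinish(), which blocks until they complete. */
double time_whole_frame(frame_fn frame)
{
    double start = now_seconds();
    frame();
    return now_seconds() - start;
}

/* Placeholder frame so the sketch compiles on its own; a real one would
 * contain the drawing code followed by glFinish(). */
void example_frame(void)
{
}
```

In the question's code, the callback would roughly correspond to the body of `render` with a `glFinish()` added before the timer stops. `glFinish()` itself hurts throughput, so this is only a measurement aid, not something to leave in a shipping build.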