OpenGL: How to get GPU usage percent?
Is this even possible?
Not really, but you can read various performance counters using your vendor's utilities; for NVIDIA you have NVPerfKit and NVPerfHUD. Other vendors have similar utilities.
Nope. It's even hard to define rigorously in such a highly parallel environment. However, you can approximate it with the ARB_timer_query extension.
I have implemented a timer query based GPU execution time measurement framework in my OpenGL rendering thread implementation. I'll share the timer query parts below:
Assume that enqueue runs a function on the rendering thread, and that limiter.frame60 is equal to 0 only once every 60 frames.
Code:
struct TimerQuery
{
    std::string description;
    GLuint timer;
};
typedef std::deque<TimerQuery> TimerQueryQueue;
...
TimerQueryQueue timerQueryQueue;
...
void GlfwThread::beginTimerQuery(std::string description)
{
    // Only instrument one frame out of every 60
    if (limiter.frame60 != 0)
        return;
    enqueue([this](std::string const& description) {
        GLuint id;
        glGenQueries(1, &id);
        timerQueryQueue.push_back({ description, id });
        glBeginQuery(GL_TIME_ELAPSED, id);
    }, std::move(description));
}
void GlfwThread::endTimerQuery()
{
    if (limiter.frame60 != 0)
        return;
    enqueue([this]{
        glEndQuery(GL_TIME_ELAPSED);
    });
}
void GlfwThread::dumpTimerQueries()
{
    while (!timerQueryQueue.empty())
    {
        TimerQuery& next = timerQueryQueue.front();
        // Stop at the first query whose result is not ready yet
        int isAvailable = GL_FALSE;
        glGetQueryObjectiv(next.timer,
                           GL_QUERY_RESULT_AVAILABLE,
                           &isAvailable);
        if (!isAvailable)
            return;
        // The result is the elapsed GPU time in nanoseconds
        GLuint64 ns;
        glGetQueryObjectui64v(next.timer, GL_QUERY_RESULT, &ns);
        DebugMessage("timer: ",
                     next.description, " ",
                     std::fixed,
                     std::setprecision(3), std::setw(8),
                     ns / 1000.0, Stopwatch::microsecText);
        glDeleteQueries(1, &next.timer);
        timerQueryQueue.pop_front();
    }
}
Here is some example output:
Framerate t=5.14 fps=59.94 fps_err=-0.00 aet=2850.67μs adt=13832.33μs alt=0.00μs cpu_usage=17%
instanceCount=20301 parallel_μs=2809
timer: text upload range 0.000μs
timer: clear and bind 95.200μs
timer: upload 1.056μs
timer: draw setup 1.056μs
timer: draw 281.568μs
timer: draw cleanup 1.024μs
timer: renderGlyphs 1.056μs
Framerate t=6.14 fps=59.94 fps_err=0.00 aet=2984.55μs adt=13698.45μs alt=0.00μs cpu_usage=17%
instanceCount=20361 parallel_μs=2731
timer: text upload range 0.000μs
timer: clear and bind 95.232μs
timer: upload 1.056μs
timer: draw setup 1.024μs
timer: draw 277.536μs
timer: draw cleanup 1.056μs
timer: renderGlyphs 1.024μs
Framerate t=7.14 fps=59.94 fps_err=-0.00 aet=3007.05μs adt=13675.95μs alt=0.00μs cpu_usage=18%
instanceCount=20421 parallel_μs=2800
timer: text upload range 0.000μs
timer: clear and bind 95.232μs
timer: upload 1.056μs
timer: draw setup 1.056μs
timer: draw 281.632μs
timer: draw cleanup 1.024μs
timer: renderGlyphs 1.056μs
This allows me to call renderThread->beginTimerQuery("draw some text"); before my OpenGL draw calls or whatever, and renderThread->endTimerQuery(); right after them, to measure the elapsed GPU execution time.
The idea here is that glBeginQuery with GL_TIME_ELAPSED issues a command to the GPU command queue right before the measured section, which records the value of some implementation-defined counter. glEndQuery then issues a GPU command to store the difference between the current count and the one recorded at the beginning of the GL_TIME_ELAPSED query. The GPU stores that result in the query object, and it becomes "available" at some asynchronous future time. My code keeps a queue of issued timer queries and checks once per second for finished measurements. My dumpTimerQueries keeps printing measurements as long as the result of the timer query at the head of the queue is available. Eventually it hits a query whose result is not available yet and stops printing messages.
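The drain-until-not-ready pattern described above can be sketched without any OpenGL at all. This is only an illustration: PendingQuery, drainFinished, and the isAvailable callback are hypothetical names, and the callback stands in for the glGetQueryObjectiv call with GL_QUERY_RESULT_AVAILABLE.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an issued timer query.
struct PendingQuery
{
    std::string description;
    unsigned id;
};

// Pops finished queries from the front of the queue, in submission order,
// and stops at the first one whose result is not ready yet.
// isAvailable stands in for the GL_QUERY_RESULT_AVAILABLE check.
std::vector<std::string> drainFinished(
    std::deque<PendingQuery>& queue,
    std::function<bool(unsigned)> const& isAvailable)
{
    assert(isAvailable && "availability callback must be set");
    std::vector<std::string> finished;
    while (!queue.empty() && isAvailable(queue.front().id))
    {
        finished.push_back(queue.front().description);
        queue.pop_front();
    }
    return finished;
}
```

Queries behind an unfinished one stay in the queue even if they happen to be ready, which keeps the output in the order the sections were issued.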
I added one more feature: it drops 59 out of every 60 calls to the measurement functions, so at 60 fps each piece of instrumentation in my program is measured only once per second. This prevents too much spam, which makes it usable to dump to stdout during development, and it keeps the measurements from interfering too much with performance. That is what limiter.frame60 is for: it is a frame counter that is guaranteed to be < 60 and wraps back to 0.
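A wrapping frame counter of this kind could look roughly like the following. FrameLimiter, nextFrame, and shouldMeasure are hypothetical names for illustration; only the frame60 member comes from the code above, and how the real limiter is advanced is an assumption.

```cpp
#include <cassert>

// Hypothetical frame limiter: frame60 counts 0, 1, ..., 59 and wraps to 0.
struct FrameLimiter
{
    int period = 60;
    int frame60 = 0;

    // Call once at the end of each frame.
    void nextFrame()
    {
        assert(period > 0);
        frame60 = (frame60 + 1) % period;
    }

    // True only on the one frame per period that should be instrumented.
    bool shouldMeasure() const { return frame60 == 0; }
};
```

Because every instrumented section checks the same counter, all sections are measured on the same frame, so their timings can be compared with each other.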
While this doesn't perfectly answer the question, you can infer the GPU usage by noting the elapsed time for all of the draw calls vs the elapsed wall clock time. If the frame was 16ms and the timer query TIME_ELAPSED was 8ms, you can infer approximately 50% GPU usage.
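As a rough sketch of that arithmetic, assuming the timed sections do not overlap and together cover all the GPU work in the frame (estimateGpuUsagePercent is a hypothetical helper, not part of my framework):

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Estimates GPU usage as the sum of GL_TIME_ELAPSED results for one frame
// divided by the wall clock duration of that frame, both in nanoseconds.
double estimateGpuUsagePercent(std::vector<std::uint64_t> const& elapsedNs,
                               std::uint64_t frameNs)
{
    assert(frameNs > 0);
    std::uint64_t const busyNs =
        std::accumulate(elapsedNs.begin(), elapsedNs.end(), std::uint64_t{0});
    return 100.0 * static_cast<double>(busyNs) / static_cast<double>(frameNs);
}
```

With a single 8 ms query and a 16 ms frame, estimateGpuUsagePercent({8000000}, 16000000) gives 50. Any GPU work that is not wrapped in a timer query is invisible to this estimate, so treat the number as a lower bound.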
One more note: what is measured is GPU execution time, obtained by putting GPU commands in the GPU queue. The threading has nothing to do with it; if the operations inside those enqueue calls were executed in a single thread, the result would be equivalent.
I have never seen anything like that. Normally you render a frame as fast as possible, do some CPU pre- or post-processing, and render the next one, so usage flips between 0 and 100%. Only rarely is the FPS capped at a maximum, and only in that case would this be a meaningful number.