OpenGL: How to get GPU usage percent?

Is this even possible?


Not really, but you can get various performance counters using your vendor's utilities; for NVIDIA there are NVPerfKit and NVPerfHUD. Other vendors have similar utilities.


Nope. It's even hard to rigorously define in such a highly parallel environment. However, you can approximate it with the ARB_timer_query extension.
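
For reference, the basic pattern looks something like this (a minimal sketch assuming a GL 3.3+ context; error handling omitted):

GLuint query;
glGenQueries(1, &query);

glBeginQuery(GL_TIME_ELAPSED, query);
// ... issue the GL commands you want to measure ...
glEndQuery(GL_TIME_ELAPSED);

GLuint64 elapsedNs = 0;
// Reading GL_QUERY_RESULT blocks until the GPU has produced the value,
// so in real code you would poll GL_QUERY_RESULT_AVAILABLE across frames
// instead (as the answer below does).
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &elapsedNs);
glDeleteQueries(1, &query);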


I have implemented a timer query based GPU execution time measurement framework in my OpenGL rendering thread implementation. I'll share the timer query parts below:

Assume

  • enqueue runs a function on the rendering thread
  • limiter.frame60 is only equal to 0 once every 60 frames

Code:

struct TimerQuery
{
    std::string description;
    GLuint timer;
};
typedef std::deque<TimerQuery> TimerQueryQueue;

...

TimerQueryQueue timerQueryQueue;

...

void GlfwThread::beginTimerQuery(std::string description)
{
    if (limiter.frame60 != 0)
        return;

    enqueue([this](std::string const& description) {
        GLuint id;
        glGenQueries(1, &id);
        timerQueryQueue.push_back({ description, id });
        glBeginQuery(GL_TIME_ELAPSED, id);
    }, std::move(description));
}

void GlfwThread::endTimerQuery()
{
    if (limiter.frame60 != 0)
        return;

    enqueue([this]{
        glEndQuery(GL_TIME_ELAPSED);
    });
}


void GlfwThread::dumpTimerQueries()
{
    while (!timerQueryQueue.empty())
    {
        TimerQuery& next = timerQueryQueue.front();

        int isAvailable = GL_FALSE;
        glGetQueryObjectiv(next.timer,
                           GL_QUERY_RESULT_AVAILABLE,
                           &isAvailable);
        if (!isAvailable)
            return;

        GLuint64 ns;
        glGetQueryObjectui64v(next.timer, GL_QUERY_RESULT, &ns);

        DebugMessage("timer: ",
                     next.description, " ",
                     std::fixed,
                     std::setprecision(3), std::setw(8),
                     ns / 1000.0, Stopwatch::microsecText);

        glDeleteQueries(1, &next.timer);

        timerQueryQueue.pop_front();
    }
}

Here is some example output:

Framerate t=5.14 fps=59.94 fps_err=-0.00 aet=2850.67μs adt=13832.33μs alt=0.00μs cpu_usage=17%
instanceCount=20301 parallel_μs=2809
timer: text upload range    0.000μs
timer: clear and bind   95.200μs
timer: upload    1.056μs
timer: draw setup    1.056μs
timer: draw  281.568μs
timer: draw cleanup    1.024μs
timer: renderGlyphs    1.056μs
Framerate t=6.14 fps=59.94 fps_err=0.00 aet=2984.55μs adt=13698.45μs alt=0.00μs cpu_usage=17%
instanceCount=20361 parallel_μs=2731
timer: text upload range    0.000μs
timer: clear and bind   95.232μs
timer: upload    1.056μs
timer: draw setup    1.024μs
timer: draw  277.536μs
timer: draw cleanup    1.056μs
timer: renderGlyphs    1.024μs
Framerate t=7.14 fps=59.94 fps_err=-0.00 aet=3007.05μs adt=13675.95μs alt=0.00μs cpu_usage=18%
instanceCount=20421 parallel_μs=2800
timer: text upload range    0.000μs
timer: clear and bind   95.232μs
timer: upload    1.056μs
timer: draw setup    1.056μs
timer: draw  281.632μs
timer: draw cleanup    1.024μs
timer: renderGlyphs    1.056μs

This allows me to call renderThread->beginTimerQuery("draw some text"); before my OpenGL draw calls, and renderThread->endTimerQuery(); right after them, to measure the elapsed GPU execution time of that section.
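
In code the usage pattern is roughly this (drawSomeText stands in for whatever GL calls you want to time; it is not part of the framework above):

renderThread->beginTimerQuery("draw some text");
drawSomeText();                  // the GL draw calls being measured
renderThread->endTimerQuery();

// ...and, on the rendering thread, drain any finished results:
dumpTimerQueries();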

The idea here is that it issues a command into the GPU command queue right before the measured section, so glBeginQuery with GL_TIME_ELAPSED records the value of some implementation-defined counter. glEndQuery issues a GPU command to store the difference between the current count and the one recorded at the beginning of the TIME_ELAPSED query. That result is stored by the GPU in the query object and becomes "available" at some asynchronous future time. My code keeps a queue of issued timer queries and checks once per second for finished measurements. dumpTimerQueries keeps printing measurements as long as the timer query at the head of the queue has a result available; eventually it hits a timer that is not available yet and stops printing.

I added a feature that drops 59 out of 60 calls to the measurement functions, so all the instrumentation in my program only measures once per second. This prevents excessive spam, keeps it usable for dumping to stdout during development, and limits the performance interference the measurements cause. That is what the limiter.frame60 check is: frame60 is guaranteed to be < 60, and it wraps.
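
The limiter itself isn't shown in the answer; a wrapping frame counter along these lines (hypothetical apart from the frame60 name) would give the behaviour described:

struct FrameLimiter
{
    int frame60 = 0;     // always < 60; equal to 0 on one frame out of every 60

    void tick()          // called once per rendered frame
    {
        frame60 = (frame60 + 1) % 60;
    }
};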

While this doesn't perfectly answer the question, you can infer GPU usage by comparing the elapsed GPU time for all of the draw calls against the elapsed wall-clock time. If the frame took 16 ms and the TIME_ELAPSED queries total 8 ms, you can infer approximately 50% GPU usage.
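
As a rough worked example (illustrative numbers only):

double gpuTimeMs   = 8.0;    // sum of the GL_TIME_ELAPSED results for the frame
double frameTimeMs = 16.0;   // wall-clock frame time, e.g. ~16.7 ms at 60 fps

double gpuUsagePercent = 100.0 * gpuTimeMs / frameTimeMs;   // ≈ 50%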

One more note: what is measured is GPU execution time, obtained by placing GPU commands in the GPU queue. The threading has nothing to do with it; if the operations inside those enqueue calls were executed on a single thread, the result would be equivalent.


I have never seen anything like that. Normally you render a frame as fast as possible, do some CPU post- or pre-processing, and render the next one, so usage oscillates between 0% and 100%. Only rarely is the frame rate capped to a maximum, and only in that case would such a number be meaningful.
