OpenCV: looking for a less CPU-intensive way to capture a frame, resize it, and copy it into a buffer. How can I optimize my code?
So I wrote this function (C++):
void CaptureFrame(char* buffer, int w, int h, int bytespan)
{
/* get a frame */
if(!cvGrabFrame(capture)){ // capture a frame
printf("Could not grab a frame\n\7");
//exit(0);
}
CVframe =cvRetrieveFrame(capture); // retrieve the captured frame
/* always check */
if (!CVframe)
{
printf("No CV frame captured!\n");
cin.get();
}
/* resize buffer for current frame */
IplImage* destination = cvCreateImage(cvSize(w, h), CVframe->depth, CVframe->nChannels);
//use cvResize to resize source to a destination image
cvResize(CVframe, destination);
IplImage* redchannel = cvCreateImage(cvGetSize(destination), 8, 1);
IplImage* greenchannel = cvCreateImage(cvGetSize(destination), 8, 1);
IplImage* bluechannel = cvCreateImage(cvGetSize(destination), 8, 1);
cvSplit(destination, bluechannel, greenchannel, redchannel, NULL);
for(int y = 0; y < destination->height; y++)
{
char* line = buffer + y * bytespan;
for(int x = 0; x < destination->width; x++)
{
line[0] = cvGetReal2D(redchannel, y, x);
line[1] = cvGetReal2D(greenchannel, y, x);
line[2] = cvGetReal2D(bluechannel, y, x);
line += 3;
}
}
cvReleaseImage(&redchannel);
cvReleaseImage(&greenchannel);
cvReleaseImage(&bluechannel);
cvReleaseImage(&destination);
}
So in general it captures a frame from the device, creates an image to resize into, and copies the result into the buffer (I need RGB or YUV420P output).
I wonder what I am doing wrong, because my function is far too CPU-intensive. What can be done to fix it?
Update:
My function is run in a thread:
void ThreadCaptureFrame()
{
while(1){
t.restart();
CaptureFrame((char *)frame->data[0], videoWidth, videoHeight, frame->linesize[0]);
AVFrame* swap = frame;
frame = readyFrame;
readyFrame = swap;
spendedTime = t.elapsed();
if(spendedTime < desiredTime){
Sleep(desiredTime - spendedTime);
}
}
}
which is started at the beginning of int main (after some initialization):
boost::thread workerThread(ThreadCaptureFrame);
So when it can keep up, it runs 24 times per second, and it uses 28% of a quad-core CPU. The camera resolution I capture at is about 320x240. So: how can I optimize it?
Things you can do:
- Instead of taking images from the camera at the default resolution, choose what resolution you want.
- I think you can simply set buffer = destination->imageData
These articles might be helpful:
- http://aishack.in/tutorials/efficiently-accessing-matrices/
- http://aishack.in/tutorials/memory-layout-of-matrices-of-multidimensional-objects/
- First, don't allocate and release the images on every frame! That probably takes the most time. Keep all your IplImages pre-allocated and release them only when your app is done. You can use boost::shared_ptr with a custom deleter so you don't have to remember to release the images.
- I don't get why you're splitting the channels and why you're copying like that. If you must copy, then just copy the whole of destination->imageData into buffer. If it is the padding that is bugging you, then do it in a loop like you did, but read directly from destination->imageData. You don't need to separate the color channels.
- Use cvResize with CV_INTER_NN. That will reduce the image quality but is faster.
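The "loop directly over destination->imageData" suggestion could look roughly like the sketch below. It is written against plain byte pointers so it does not depend on OpenCV; copyBgrToRgb is a hypothetical helper name, and it assumes the resized image is 8-bit, 3-channel, interleaved in B, G, R order (the order the question's cvSplit call implies). The two stride parameters stand in for destination->widthStep and bytespan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copy an interleaved 8-bit BGR image into an RGB buffer.
// srcStep and dstStep are row strides in bytes and may include padding,
// so each row is addressed separately instead of copying one flat block.
void copyBgrToRgb(const std::uint8_t* src, std::size_t srcStep,
                  std::uint8_t* dst, std::size_t dstStep,
                  int width, int height)
{
    assert(src != nullptr && dst != nullptr);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* in = src + y * srcStep;  // start of source row
        std::uint8_t* out = dst + y * dstStep;       // start of output row
        for (int x = 0; x < width; ++x) {
            out[0] = in[2];  // R
            out[1] = in[1];  // G
            out[2] = in[0];  // B
            in += 3;
            out += 3;
        }
    }
}
```

Inside CaptureFrame this would replace the three channel images, the cvSplit call, and the cvGetReal2D loop with something like copyBgrToRgb((const std::uint8_t*)destination->imageData, destination->widthStep, (std::uint8_t*)buffer, bytespan, w, h).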
I'm not familiar with OpenCV, but if I'm reading your code correctly, you're:
- reading from the camera's buffer to memory (1 copy)
- resizing the image (1 copy)
- splitting the image into R, G, and B channels (3 copies)
- re-merging the channels into the buffer (1 copy)
I think that's a lot of unnecessary copying: for each frame you make 6 copies of the image. If your image is 320x240 with 24-bit color at 24 fps, you'd be moving around at least 32 MB/sec; with a 1000x1000 frame you're talking about half a gigabyte per second. Note that this is a very crude back-of-the-envelope underestimate: depending on the resizing algorithm, extra copying may be done, reading from or writing to non-aligned memory locations may incur some overhead, and so on.
You can probably skip step #3 and/or #4, though I'm not familiar enough with OpenCV to suggest how.
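To make the back-of-the-envelope numbers concrete, here is a small sketch of the arithmetic. bytesPerSecond is a hypothetical helper, and it treats every copy as a full-size 3-byte-per-pixel image, which is the same simplification the estimate above makes.

```cpp
#include <cstdint>

// Bytes moved per second if each frame is copied `copies` times in full.
constexpr std::uint64_t bytesPerSecond(std::uint64_t width, std::uint64_t height,
                                       std::uint64_t bytesPerPixel,
                                       std::uint64_t copies, std::uint64_t fps)
{
    return width * height * bytesPerPixel * copies * fps;
}

// 320x240, 24-bit color, 6 copies, 24 fps: 33,177,600 bytes/s (about 32 MB/s).
static_assert(bytesPerSecond(320, 240, 3, 6, 24) == 33177600ULL,
              "320x240 estimate");
// 1000x1000 under the same assumptions: 432,000,000 bytes/s (roughly half a GB/s).
static_assert(bytesPerSecond(1000, 1000, 3, 6, 24) == 432000000ULL,
              "1000x1000 estimate");
```

Dropping the split and re-merge steps would take the copy count from 6 to 2 in this model, which cuts the estimate to a third.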