How to convert to 16 bit pixel format for use in OpenGL ES 2.0 when reading frames from a video (AV Foundation)

I'm using OpenGL to do some image processing on each frame of a 1280x720 QuickTime video. The frames are then read back and a new video is created from them. The problem is the large amount of data that needs to be transferred to and from OpenGL (using glTexImage2D and glReadPixels), resulting in a very slow process.

Currently, I'm using kCVPixelFormatType_32BGRA as the pixel format for my AVAssetReaderTrackOutput instance. To decrease time consumption, I would like to use a 16 bit pixel format instead. Unfortunately, changing to such a format gives me empty frames when calling AVAssetReaderTrackOutput's copyNextSampleBuffer method. Does anyone have experience with using a 16 bit pixel format in AV Foundation?

If I can't get AV Foundation to change the format for me, I suppose I could convert from 32 bit to 16 bit "manually", maybe using NEON instructions? Any help is appreciated.
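To be concrete about what I mean by a "manual" conversion, here is a scalar C sketch of a BGRA8888 to RGB565 conversion (the function name and layout are mine; a NEON version would apply the same shifts to 8 or 16 pixels per iteration):

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative scalar conversion: BGRA8888 -> RGB565.
   Keeps the top 5/6/5 bits of the red, green and blue channels;
   alpha is discarded. */
static void bgra8888_to_rgb565(const uint8_t *src, uint16_t *dst, size_t pixelCount)
{
    for (size_t i = 0; i < pixelCount; i++) {
        uint8_t b = src[4 * i + 0];
        uint8_t g = src[4 * i + 1];
        uint8_t r = src[4 * i + 2];
        dst[i] = (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
    }
}
```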


A further revision; this answer is now community wiki, because I've made so many errors in answering this question alone that it makes sense.

Although CoreGraphics would prima facie be able to do a 32-bit to 16-bit conversion for you using something like the following code, it instead reports that "4 integer bits/component; 16 bits/pixel; 3-component color space; kCGImageAlphaPremultipliedLast" is an unsupported parameter combination. So it seems that CoreGraphics can't internally handle 4 bits/channel images.

CGColorSpaceRef colourSpace = CGColorSpaceCreateDeviceRGB();
CGDataProviderRef dataProvider = CGDataProviderCreateWithData(NULL, buffer, width*height*4, NULL);
CGImageRef inputImage = CGImageCreate(  width, height,
                                        8, 32, width*4, 
                                        colourSpace, 
                                        kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big,
                                        dataProvider,
                                        NULL, NO,
                                        kCGRenderingIntentDefault);
CGDataProviderRelease(dataProvider);

unsigned char *outputImage = (unsigned char *)malloc(width*height*2);

/* This is the call that fails: CGBitmapContextCreate logs the unsupported
   parameter combination quoted above and returns NULL. */
CGContextRef targetContext = CGBitmapContextCreate( outputImage,
                                                    width, height,
                                                    4, width*2,
                                                    colourSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGContextDrawImage(targetContext, CGRectMake(0, 0, width, height), inputImage);

/* upload outputImage to OpenGL here! */

CGContextRelease(targetContext);
CGImageRelease(inputImage);
CGColorSpaceRelease(colourSpace);
free(outputImage);

However, per the documentation:

Supported pixel formats are kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange and kCVPixelFormatType_32BGRA, except on iPhone 3G, where the supported pixel formats are kCVPixelFormatType_422YpCbCr8 and kCVPixelFormatType_32BGRA.
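So requesting one of the supported biplanar formats should give you non-empty sample buffers. A minimal sketch of the output settings (assuming you already have an AVAssetTrack named videoTrack):

```objc
// Ask AV Foundation for biplanar Y'CbCr instead of 32-bit BGRA.
// (Not supported on iPhone 3G, per the documentation above.)
NSDictionary *outputSettings = [NSDictionary dictionaryWithObject:
        [NSNumber numberWithUnsignedInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange]
    forKey:(id)kCVPixelBufferPixelFormatTypeKey];

AVAssetReaderTrackOutput *trackOutput =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack
                                               outputSettings:outputSettings];
```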

So, to reduce the size of what you receive, you could switch to a YCbCr colour space. As the buffers come back biplanar (i.e. all Y components for the entire image as one block, then all Cb and Cr components as a separate block), you can upload them as two individual textures to OpenGL and recombine them in a shader, assuming you're happy limiting yourself to the 3GS and above, and can afford to spend 2 of the 8 texture units available on SGX iOS devices.
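For the recombination step, a fragment-shader sketch along these lines should work (assuming the Y plane is bound as a GL_LUMINANCE texture and the interleaved CbCr plane as a GL_LUMINANCE_ALPHA texture; the constants below are the usual BT.601 video-range ones, but verify them against your source material):

```glsl
varying mediump vec2 texCoord;
uniform sampler2D yTexture;    // full-resolution Y plane
uniform sampler2D cbcrTexture; // half-resolution interleaved CbCr plane

void main()
{
    mediump float y   = texture2D(yTexture, texCoord).r;
    // For GL_LUMINANCE_ALPHA, Cb lands in .r and Cr in .a.
    mediump vec2 cbcr = texture2D(cbcrTexture, texCoord).ra - vec2(0.5);

    // BT.601 video-range Y'CbCr -> RGB
    mediump float yScaled = 1.164 * (y - 16.0 / 255.0);
    gl_FragColor = vec4(yScaled + 1.596 * cbcr.y,
                        yScaled - 0.813 * cbcr.y - 0.391 * cbcr.x,
                        yScaled + 2.018 * cbcr.x,
                        1.0);
}
```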

YCbCr is a colour space that represents colour as brightness (the Y) and colour (the CbCr) separately. It's been shown empirically that the colour channel can be sampled at a lower frequency than the brightness without anybody being able to tell. The '420' part of the pixel format describes how many Cb and Cr components you get for each 4 Y components — essentially it's telling you that you get one sample of Cb and one of Cr for every four samples of Y. Hence you have a total of six bytes to describe four pixels, for 12 bits/pixel rather than 24 bits/pixel in RGB. That saves you 50% of your storage.
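As a sanity check on that arithmetic (the helper name here is mine): a 1280x720 frame drops from 3,686,400 bytes as 32-bit BGRA to 1,382,400 bytes as 4:2:0 biplanar Y'CbCr, i.e. 12 bits/pixel.

```c
#include <stddef.h>

/* Bytes needed for a 4:2:0 biplanar frame: one Y byte per pixel,
   plus one Cb and one Cr byte per 2x2 block of pixels. */
static size_t ycbcr420_frame_bytes(size_t width, size_t height)
{
    size_t yBytes    = width * height;                  /* luma plane   */
    size_t cbcrBytes = (width / 2) * (height / 2) * 2;  /* chroma plane */
    return yBytes + cbcrBytes;                          /* 1.5 bytes/pixel */
}
```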

For GL purposes you've potentially incurred an extra cost because it's two uploads rather than one. You're also going to need to use three varyings if you want to avoid dependent texture reads, and I think the SGX is limited to eight of those.
