How to convert to 16 bit pixel format for use in OpenGL ES 2.0 when reading frames from a video (AV Foundation)

I'm using OpenGL to do some image processing on each frame of a 1280x720 QuickTime video. The frames are then read back and a new video is created from them. The problem is the large amount of data that needs to be transferred to and from OpenGL (using glTexImage2D and glReadPixels), resulting in a very slow process.

Currently, I'm using kCVPixelFormatType_32BGRA as the pixel format for my AVAssetReaderTrackOutput instance. To decrease time consumption, I would like to use a 16 bit pixel format instead. Unfortunately, changing to such a format gives me empty frames when calling AVAssetReaderTrackOutput's copyNextSampleBuffer method. Does anyone have experience with using a 16 bit pixel format in AV Foundation?

If I can't get AV Foundation to change the format for me, I suppose I could convert from 32 bit to 16 bit "manually", maybe using NEON instructions? Any help is appreciated.
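To be concrete about what I mean by a "manual" conversion, here is a scalar C sketch of a BGRA8888 to RGB565 conversion (the function name and layout are mine; a NEON version would apply the same shifts to 8 or 16 pixels per iteration):

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative scalar conversion: BGRA8888 -> RGB565.
   Keeps the top 5/6/5 bits of the red, green and blue channels;
   alpha is discarded. */
static void bgra8888_to_rgb565(const uint8_t *src, uint16_t *dst, size_t pixelCount)
{
    for (size_t i = 0; i < pixelCount; i++) {
        uint8_t b = src[4 * i + 0];
        uint8_t g = src[4 * i + 1];
        uint8_t r = src[4 * i + 2];
        dst[i] = (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
    }
}
```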


A further revision; this answer is now community wiki, because I've made so many errors in answering this question alone that it makes sense.

Although CoreGraphics would prima facie be able to do a 32-bit to 16-bit conversion for you using something like the following code, it instead reports that "4 integer bits/component; 16 bits/pixel; 3-component color space; kCGImageAlphaPremultipliedLast" is an unsupported parameter combination. So it seems that CoreGraphics can't internally handle 4 bits/channel images.

CGColorSpaceRef colourSpace = CGColorSpaceCreateDeviceRGB();
CGDataProviderRef dataProvider = CGDataProviderCreateWithData(NULL, buffer, width*height*4, NULL);
CGImageRef inputImage = CGImageCreate(  width, height,
                                        8, 32, width*4, 
                                        colourSpace, 
                                        kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big,
                                        dataProvider,
                                        NULL, NO,
                                        kCGRenderingIntentDefault);
CGDataProviderRelease(dataProvider);

unsigned char *outputImage = (unsigned char *)malloc(width*height*2);

/* This is the call that fails: CGBitmapContextCreate logs the unsupported
   parameter combination quoted above and returns NULL. */
CGContextRef targetContext = CGBitmapContextCreate( outputImage,
                                                    width, height,
                                                    4, width*2,
                                                    colourSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGContextDrawImage(targetContext, CGRectMake(0, 0, width, height), inputImage);

/* upload outputImage to OpenGL here! */

CGContextRelease(targetContext);
CGImageRelease(inputImage);
CGColorSpaceRelease(colourSpace);
free(outputImage);

However, per the documentation:

Supported pixel formats are kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange and kCVPixelFormatType_32BGRA, except on iPhone 3G, where the supported pixel formats are kCVPixelFormatType_422YpCbCr8 and kCVPixelFormatType_32BGRA.
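So requesting one of the supported biplanar formats should give you non-empty sample buffers. A minimal sketch of the output settings (assuming you already have an AVAssetTrack named videoTrack):

```objc
// Ask AV Foundation for biplanar Y'CbCr instead of 32-bit BGRA.
// (Not supported on iPhone 3G, per the documentation above.)
NSDictionary *outputSettings = [NSDictionary dictionaryWithObject:
        [NSNumber numberWithUnsignedInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange]
    forKey:(id)kCVPixelBufferPixelFormatTypeKey];

AVAssetReaderTrackOutput *trackOutput =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack
                                               outputSettings:outputSettings];
```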

So, to reduce the size of what you receive, you could switch to a YCbCr colour space. As the buffers come back biplanar (i.e. all Y components for the entire image as one block, then all Cb and Cr components as a separate block), you can upload them as two individual textures to OpenGL and recombine them in a shader, assuming you're happy limiting yourself to the 3GS and above, and can afford to spend 2 of the 8 texture units available on SGX iOS devices.
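For the recombination step, a fragment-shader sketch along these lines should work (assuming the Y plane is bound as a GL_LUMINANCE texture and the interleaved CbCr plane as a GL_LUMINANCE_ALPHA texture; the constants below are the usual BT.601 video-range ones, but verify them against your source material):

```glsl
varying mediump vec2 texCoord;
uniform sampler2D yTexture;    // full-resolution Y plane
uniform sampler2D cbcrTexture; // half-resolution interleaved CbCr plane

void main()
{
    mediump float y   = texture2D(yTexture, texCoord).r;
    // For GL_LUMINANCE_ALPHA, Cb lands in .r and Cr in .a.
    mediump vec2 cbcr = texture2D(cbcrTexture, texCoord).ra - vec2(0.5);

    // BT.601 video-range Y'CbCr -> RGB
    mediump float yScaled = 1.164 * (y - 16.0 / 255.0);
    gl_FragColor = vec4(yScaled + 1.596 * cbcr.y,
                        yScaled - 0.813 * cbcr.y - 0.391 * cbcr.x,
                        yScaled + 2.018 * cbcr.x,
                        1.0);
}
```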

YCbCr is a colour space that represents colour as brightness (the Y) and colour (the CbCr) separately. It's been shown empirically that the colour channel can be sampled at a lower frequency than the brightness without anybody being able to tell. The '420' part of the pixel format describes how many Cb and Cr components you get for each 4 Y components — essentially it's telling you that you get one sample of Cb and one of Cr for every four samples of Y. Hence you have a total of six bytes to describe four pixels, for 12 bits/pixel rather than 24 bits/pixel in RGB. That saves you 50% of your storage.
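As a sanity check on that arithmetic (the helper name here is mine): a 1280x720 frame drops from 3,686,400 bytes as 32-bit BGRA to 1,382,400 bytes as 4:2:0 biplanar Y'CbCr, i.e. 12 bits/pixel.

```c
#include <stddef.h>

/* Bytes needed for a 4:2:0 biplanar frame: one Y byte per pixel,
   plus one Cb and one Cr byte per 2x2 block of pixels. */
static size_t ycbcr420_frame_bytes(size_t width, size_t height)
{
    size_t yBytes    = width * height;                  /* luma plane   */
    size_t cbcrBytes = (width / 2) * (height / 2) * 2;  /* chroma plane */
    return yBytes + cbcrBytes;                          /* 1.5 bytes/pixel */
}
```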

For GL purposes you've potentially incurred an extra cost because it's two uploads rather than one. You're also going to need to use three varyings if you want to avoid dependent texture reads, and I think the SGX is limited to eight of those.
