How to Seek Audio/Video Data with Variable Bitrate (VBR)?
This might be a way too generic question, but what is the general approach for seeking within media files (video or audio of any kind/format) if the data has variable bitrate (VBR)?
It seems an easy thing to do if the stream has a constant bitrate (CBR). E.g. if you know it is 256 kbit/s and you w开发者_如何学编程ant to seek forward/backward by 30 seconds, just calculate how many bits that are (approximately), convert that to byte and seek that many byte forwards/backwards in the file. Finally keep on reading and parsing till the next header/block-start/keyframe/whatever is found and continue playback from there.
Okay, but what do you do if the bitrate is highly variable? E.g. it can be anything from 32 to 512 kbit/s and keeps changing frequently? I know that this might depend on audio/video format. Some file formats have index tables at the beginning/end you might use and some files contain pointers in the stream how many bytes to skip for skipping the next X seconds. In that case you can work with that information, however, what if the format has no such table or pointers?
The most naive approach I can think of is to just estimate the bitrate as good as possible (e.g. by looking at the average bitrate of the last couple of seconds you played back), jumping where you think it could be correct according to the estimated bitrate, and look how far you really jumped. If you jumped too much, try jumping back a bit. If you jumped too little, try jumping forward a bit. Maybe keep jumping in one direction till you jumped too far again, now reverse the direction again as well as the step size (similar to a binary search algorithm). Each time you jumped too far, you reverse the direction and decrease the step size. You will get closer and closer to the correct point and if you are close enough (below some chosen delta), just start playing again (after all the jump doesn't have to be accurate to the millisecond).
While the algorithm above might work, it sounds rather poor and is probably very slow in practice. So how is it really done? Anyone ever wrote a media player/player-plugin of some kind? Or is it just like this that every "decent" format supporting VBR must have some kind of index tables or skip pointers in the stream if it expects software to correctly seek and not just play from start to end?
This is exactly why (for example) DVDs use VOB files instead of raw bitstreams. With a VOB file you get not only the bitstream itself, but also pointers to consecutive frames, so you can skip to another frame quickly and easily.
OTOH, most video bit streams are designed so it's fairly easy to synchronize and find the beginning of a frame (necessary for things like broadcast TV), so estimating (and rounding down), then searching for a frame start works reasonably well also. Many also have constraints on the maximum compression rate that's acceptable, so careful estimation can give you reasonably close to the beginning of a frame.
The binary search algorithm that you describe is more or less how seeking in Ogg Vorbis files work. I've never seen another format use it, most use some kind of index structure for seeking.
You can store a coarse index to your file at the beginning, with the location of a fixed number of seek points within the file. At each such seek point, you can store a sub-index of seek points within that section of the file, and so on recursively. You can look at this as a kind of B+ tree, but with nodes stored immediately before the data they index.
I don't know if anyone actually does that, but it should be much faster than the binary search you described, especially if each seek is mechanical and therefore slow.
精彩评论