How to read an MP3 file, separating metadata from audio?
I understand that the MP3 file format essentially consists of two segments: ID3 metadata and audio frames. How can I read, in binary form, all of the ID3 segment and all of the audio frames as two binary blobs? I'm looking to simply perform a hash calculation on the metadata and the audio as two separate units in a file. How can I determine where the "split point" is in the file?
From the ID3 tag specification:
+-----------------------------+
|      Header (10 bytes)      |
+-----------------------------+
|       Extended Header       |
| (variable length, OPTIONAL) |
+-----------------------------+
|   Frames (variable length)  |
+-----------------------------+
|           Padding           |
| (variable length, OPTIONAL) |
+-----------------------------+
| Footer (10 bytes, OPTIONAL) |
+-----------------------------+
Note that there are several ID3 tag versions out there.
Specification: http://www.id3.org/id3v2.4.0-structure
There are usually zero, one, or two metadata chunks.
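At the beginning of the file there may be an optional ID3 version 2 metadata chunk, which comes in three subversions (2.2, 2.3 and 2.4). It begins with the 3-byte magic word "ID3" and always has a variable length, which is encoded in its 10-byte header as a 28-bit "synchsafe" integer (four bytes, 7 bits used in each). That length excludes the header itself, and in ID3v2.4 it also excludes the optional 10-byte footer, so the total size depends slightly on the subversion.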
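As a rough illustration of the size calculation, here is a minimal Python sketch. The function name `id3v2_tag_length` is made up for this example, and it assumes the tag sits at the very start of the data and that the header is not corrupted.

```python
def id3v2_tag_length(data: bytes) -> int:
    """Return the total byte length of a leading ID3v2 tag, or 0 if none is found.

    Hypothetical helper for illustration; assumes the tag starts at offset 0.
    """
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    major_version = data[3]
    flags = data[5]
    size_bytes = data[6:10]
    # Each size byte must have its top bit clear to be a valid synchsafe integer.
    if any(b & 0x80 for b in size_bytes):
        return 0
    size = (size_bytes[0] << 21) | (size_bytes[1] << 14) | (size_bytes[2] << 7) | size_bytes[3]
    total = 10 + size  # the encoded size excludes the 10-byte header
    # ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    if major_version == 4 and flags & 0x10:
        total += 10
    return total
```

If this returns a non-zero value `n`, the first `n` bytes are the ID3v2 blob and the audio frames would normally start right after it.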
Then you have the audio frames. There is a variable number of them. There is no header telling how many there will be or where in the file they end.
Then at the end of the file there may be an optional ID3 version 1 metadata chunk, which has a fixed length of 128 bytes and begins with the 3-byte magic word "TAG".
Rarely, an ID3v2 tag might be at the end of the file or even in the middle.
Also there are rare extensions which may add extra stuff to the ID3v1 tag making it longer.
You can iterate through all the "frames" in an MP3 file. The first few bytes of each one can be used to tell whether it is an ID3v2 tag (it starts with "ID3"), an MP3 audio frame (it starts with the synch pattern), or an ID3v1 tag (it starts with "TAG").
Note that errors or corruption are not rare in the audio frames. Each audio frame starts with a 4-byte header whose first 11 bits are all set (a 0xFF byte followed by a byte whose top three bits are 1), called the "synch" pattern, and you have to use the remaining bits of the header both to do a sanity check and to calculate the length of the frame.
When a frame doesn't begin with the synch pattern or an ID3 tag magic word, or fails the sanity check, you should skip bytes until you find the next synch pattern.
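The sanity check and resynchronisation could look something like the sketch below. To keep it short it only accepts MPEG-1 Layer III headers (the most common kind of MP3) and rejects "free format" bitrates; that restriction, and the names `mpeg1_layer3_frame_length` and `find_next_frame`, are assumptions for this example, not something a real parser should rely on.

```python
# Bitrate (kbit/s) and sample rate (Hz) tables for MPEG-1 Layer III.
BITRATES_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATES_HZ = [44100, 48000, 32000]


def mpeg1_layer3_frame_length(header: bytes) -> int:
    """Return the frame length in bytes, or 0 if the header fails the sanity check."""
    if len(header) < 4:
        return 0
    b0, b1, b2 = header[0], header[1], header[2]
    # Synch pattern: 11 set bits (0xFF, then the top three bits of the next byte).
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return 0
    version = (b1 >> 3) & 0x03       # 3 means MPEG-1
    layer = (b1 >> 1) & 0x03         # 1 means Layer III
    bitrate_index = b2 >> 4          # 0 is "free format", 15 is invalid
    rate_index = (b2 >> 2) & 0x03    # 3 is reserved
    padding = (b2 >> 1) & 0x01
    if version != 3 or layer != 1:
        return 0
    if bitrate_index in (0, 15) or rate_index == 3:
        return 0
    bitrate = BITRATES_KBPS[bitrate_index] * 1000
    return 144 * bitrate // SAMPLE_RATES_HZ[rate_index] + padding


def find_next_frame(data: bytes, start: int = 0) -> int:
    """Return the offset of the next plausible frame header at or after start, or -1."""
    pos = data.find(b"\xff", start)
    while pos != -1:
        if mpeg1_layer3_frame_length(data[pos:pos + 4]):
            return pos
        pos = data.find(b"\xff", pos + 1)
    return -1
```

A stricter check would also confirm that another valid header follows at `pos + length`, since a lone synch pattern can show up by chance inside other data.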
So you can either take some shortcuts, which will work most of the time, or iterate through the whole file, which can be slow. Also I'm not really an expert so there are likely to be things I've left out due to ignorance. In particular, though there are mechanisms ("unsynchronisation") to make sure there are no false synch patterns embedded in the metadata, I believe that sometimes they still occur.
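For the "shortcut" route, a minimal Python sketch might look like this. It assumes at most one ID3v2 tag at the very start and at most one plain 128-byte ID3v1 tag at the very end, and treats everything in between as audio. The names `split_mp3` and `hash_parts` are invented for this example, and SHA-256 is just an arbitrary choice of hash.

```python
import hashlib


def split_mp3(data: bytes):
    """Return (id3v2_blob, audio_blob, id3v1_blob) using the shortcut approach."""
    start = 0
    # Leading ID3v2 tag: "ID3" magic plus a valid synchsafe size in bytes 6..9.
    if len(data) >= 10 and data[:3] == b"ID3" and all(b < 0x80 for b in data[6:10]):
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        start = 10 + size
        if data[3] == 4 and data[5] & 0x10:  # optional ID3v2.4 footer
            start += 10
        start = min(start, len(data))  # guard against a truncated file
    end = len(data)
    # Trailing ID3v1 tag: exactly 128 bytes, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[:start], data[start:end], data[end:]


def hash_parts(data: bytes) -> dict:
    """Hash the metadata (both tags concatenated) and the audio separately."""
    id3v2, audio, id3v1 = split_mp3(data)
    return {
        "metadata": hashlib.sha256(id3v2 + id3v1).hexdigest(),
        "audio": hashlib.sha256(audio).hexdigest(),
    }
```

Usage would be something like `hash_parts(open("song.mp3", "rb").read())`, where the file name is a placeholder. This will misclassify files with tags in the middle, extended ID3v1 tags, or other tag formats, which is where iterating over the whole file comes in.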
Hope this helps for any new people coming here via the Googles (-: