
How to detect generation loss in transcoded audio

Let's say you have a 96 kbit/s MP3 and you transcode the file into a 320 kbit/s MP3. How could you programmatically detect the original bit rate or quality? Generation loss occurs because each time a lossy algorithm is applied, new information is deemed "unnecessary" and discarded. How could an algorithm use this property to detect that audio has been transcoded?

128 kbps LAME mp3 transcoded to 320 kbps LAME mp3 (I Feel You, Depeche Mode) 10.8 MB.


This image was taken from the bottom of this site. The two tracks above look nearly identical, but the difference is enough to support this argument.


One way to do it is to analyze the spectrum of the signal. I'm not sure it's possible to determine the exact original rate, but you can definitely tell a real 320 kbps MP3 from a 96 -> 320 kbps transcode. The 96 kbps MP3 will have its higher frequencies cut at 15 kHz or so. A real 320 kbps MP3 should have non-zero content at around 18-20 kHz or even higher (that depends on the encoder).
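A minimal sketch of that spectral check in Python with NumPy, assuming the MP3 has already been decoded to mono PCM samples by some other tool. The function name `estimate_cutoff_hz`, the 4096-sample frame size and the 60 dB drop threshold are illustrative assumptions, not values from any encoder; real files roll off gradually, so the threshold would need tuning.

```python
import numpy as np


def estimate_cutoff_hz(samples, sample_rate, frame_size=4096, drop_db=60.0):
    """Estimate the highest frequency that still carries energy.

    samples:     1-D array of decoded mono PCM samples (assumption: decoding
                 the MP3 to PCM is done elsewhere).
    sample_rate: sample rate of the decoded audio in Hz.
    drop_db:     hypothetical threshold; a bin counts as "present" if its
                 average level is within drop_db of the loudest bin.
    """
    samples = np.asarray(samples, dtype=np.float64)
    n_frames = len(samples) // frame_size
    if n_frames == 0:
        raise ValueError("need at least one full frame of audio")

    # Split into non-overlapping frames and apply a Hann window to each one
    # so that spectral leakage does not blur the cutoff too much.
    frames = samples[: n_frames * frame_size].reshape(n_frames, frame_size)
    window = np.hanning(frame_size)
    spectra = np.fft.rfft(frames * window, axis=1)

    # Average the power spectrum over all frames, then convert to dB.
    power = np.mean(np.abs(spectra) ** 2, axis=0)
    level_db = 10.0 * np.log10(power + 1e-20)

    # The cutoff estimate is the highest bin still above the threshold.
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    present = np.nonzero(level_db >= level_db.max() - drop_db)[0]
    return float(freqs[present[-1]])
```

On a file labelled 320 kbps, an estimate near 15 kHz rather than 18-20 kHz would be a hint that it was transcoded from a lower bit rate. It is only a hint: quiet or band-limited source material, or an encoder configured with a low lowpass, gives the same picture.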


The bit rate is stored in the MPEG frame header, but that is the bit rate of the current encoding. Unless you stored the original bit rate somewhere, for example in an ID3 tag, there is no easy way to recover it.

EDIT: Updated the answer, looks like I misunderstood the original question.
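To illustrate why the header does not help here, this is a rough sketch of reading the bit rate from a single MPEG-1 Layer III frame header. The function name `parse_mpeg1_layer3_header` is made up for this example, and the sketch assumes you have already located the 4 header bytes of a frame (skipping any ID3 tag and finding the sync word is left out). It handles only MPEG-1 Layer III, not MPEG-2/2.5 or free-format streams.

```python
# Bit rates in kbit/s for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
# Index 0 means "free format" and index 15 is invalid; both are None here.
MPEG1_LAYER3_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                     128, 160, 192, 224, 256, 320, None]

# Sample rates in Hz for MPEG-1, indexed by the 2-bit sampling field.
MPEG1_SAMPLE_RATES = [44100, 48000, 32000, None]


def parse_mpeg1_layer3_header(header):
    """Return (bitrate_kbps, sample_rate_hz) from the first 4 header bytes."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    b0, b1, b2 = header[0], header[1], header[2]

    # The first 11 bits of every frame are the sync word (all ones).
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        raise ValueError("no frame sync found")

    version = (b1 >> 3) & 0x03  # 3 means MPEG-1
    layer = (b1 >> 1) & 0x03    # 1 means Layer III
    if version != 3 or layer != 1:
        raise ValueError("only MPEG-1 Layer III is handled in this sketch")

    kbps = MPEG1_LAYER3_KBPS[b2 >> 4]
    sample_rate = MPEG1_SAMPLE_RATES[(b2 >> 2) & 0x03]
    if kbps is None or sample_rate is None:
        raise ValueError("free-format or reserved value in header")
    return kbps, sample_rate
```

For a 96 -> 320 kbps transcode, every frame header simply reports 320, because the header describes how the frame was encoded this time and carries no history of earlier encodings.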


If you're transcoding by converting the original MP3 to an uncompressed format (like WAV) and then re-encoding to MP3 at the higher bitrate, then it would be impossible to determine the original file's bitrate given only the converted file. I suppose this process might produce some incredibly subtle audio artifacts that could be analyzed statistically, but this would be a pretty herculean effort, in my opinion, and unlikely to succeed.

I'm not sure if it's even possible to up-rate an MP3 without decoding and re-encoding, but even if it is possible, the process still would not preserve the original bitrate in the new file. Again, this process may produce some kind of weird, measurable artifacts that might hint at the original bitrate, but I doubt it.

Update: now that I think about it, it might be possible somehow to detect this, although I have no idea how to do it programmatically. The human ear can make distinctions like this (some of them, anyway): I can clearly tell the difference between 128k MP3s and 192k MP3s, so discriminating between 96k and 320k would be a piece of cake. A 96k MP3 that had been upcoded would still have all the audio artifacts present in the 96k version (plus new ones, unfortunately).

I don't know how you would go about determining this with code, however. If I had to make this work, I'd train pigeons to do it (and I'm not kidding about that).


The difference that you see in the spectral display is probably mostly due to quantization error. If you max out the bit depth (resolution) on the lower bitrate audio file, and keep that bit depth when you upconvert (oversample) it, the spectral displays should match more closely. The encoder also probably used some dithering to avoid audio artifacts due to the quantization errors.

If the bit depth was already maxed out at the lower bitrate, then the added points will be obvious and you'll see some jagged edges in the waveform. Otherwise, given sufficient bit depth, you won't be able to determine which points were original and which were added. This is especially true of higher-end upconverters that use curves to project the new points instead of simply plotting them evenly between the existing ones.

By definition, the sample rate determines the possible frequency range, so this is going to be your best bet in determining the original bitrate, as Igor suggested.
