
How can I append to a recorded MPEG4 AAC file?

I'm recording audio on an iPhone, using an AVAudioRecorder with the following settings:

NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
       [NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
       [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
       [NSNumber numberWithInt:1], AVNumberOfChannelsKey,
       [NSNumber numberWithInt:12800], AVEncoderBitRateKey,
       [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
       [NSNumber numberWithInt: AVAudioQualityHigh],  AVEncoderAudioQualityKey,
       nil];

(I can be flexible on most of these settings, but I have to use MPEG4 AAC.)

I save the audio to a file.

The user needs to be able to come back at a later date and continue recording to the same file. There doesn't seem to be an option to do this directly with AVAudioRecorder, so instead I'm recording to a new file and concatenating them.

At the moment I'm appending the files using an AVMutableComposition and an AVMutableCompositionTrack, but it's really slow for longer recordings, so this approach isn't feasible.

I'm thinking it would be much quicker if I could strip the header from the second file, append the audio data to the first file, then alter the header of the combined file to reflect the new duration. As I know both files were created with exactly the same settings, I figure the other details in the headers should be identical.

Unfortunately I can't find any information about what format the headers are in, or if it's possible to combine files in this way.

So my questions are:

  • What is the format of the MPEG-4 AAC file header, when created on an iPhone?
  • Can I combine two audio files by messing with the headers like this?
  • Is there a better way of appending two MPEG-4 AAC audio files almost instantaneously?


Though we ask the AVAudioRecorder to record in MPEG-4 AAC format, it always produces a .caf (Core Audio Format) file. This is just a wrapper format, however, and the actual audio data it contains is in AAC format.

In the end, appending files came down to manipulating the .caf files byte by byte. Apple publishes a specification for the Core Audio Format. Digesting this document and processing the files accordingly was a little off-putting at first, but it turns out the spec is very clear and complete, so it wasn't too onerous.

As the spec explains, .caf files consist of chunks with four-byte names at the beginning. For AAC files, there's always a desc chunk and a kuki chunk. As we know our two original files are in the same format, we can copy these chunks unchanged to the output file.

There's also a pakt chunk and a data chunk. We can't guarantee which order these will appear in within the input files. There may or may not be a free chunk, but this just contains 0x00 padding, so we needn't copy it to the output file.
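To find these chunks you walk the file: after the 8-byte file header ("caff" plus a version and a flags field), each chunk starts with its four-byte type followed by a big-endian 64-bit size. Here's a minimal C sketch of that walk, assuming the whole file fits in memory; the function names are my own and error handling is kept to a minimum:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read a big-endian unsigned 64-bit value (CAF files are big-endian throughout). */
static uint64_t read_be64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return v;
}

/* Walk the chunks of a CAF image in memory, printing each chunk's
   four-byte type and size. Returns the number of chunks seen,
   or -1 if the buffer doesn't start with a CAF file header. */
static int walk_caf_chunks(const uint8_t *buf, size_t len) {
    if (len < 8 || memcmp(buf, "caff", 4) != 0) return -1;
    size_t off = 8;                 /* skip "caff" + version + flags */
    int count = 0;
    while (off + 12 <= len) {       /* 4-byte type + 8-byte size */
        char type[5] = {0};
        memcpy(type, buf + off, 4);
        uint64_t size = read_be64(buf + off + 4);
        printf("chunk '%s', %llu bytes\n", type, (unsigned long long)size);
        count++;
        off += 12 + size;           /* step over header and payload */
    }
    return count;
}
```

One caveat: the spec allows the final data chunk to carry a size of -1 when the length wasn't known at write time. Files finalized by AVAudioRecorder have real sizes, but a robust parser should handle that case.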

To combine the pakt chunks, we need to examine the chunk headers and produce a new pakt chunk whose mNumberPackets and mNumberValidFrames fields are the sums of those in the input files. The mPrimingFrames and mRemainderFrames fields are always zero; they're only relevant for streaming media. The bulk of the pakt chunks (i.e. the actual packet-table data) can just be concatenated.

Similarly for the data chunks: the mChunkSize fields need to be summed and then the bulk of the data can be concatenated.
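Since both files share identical settings, the header arithmetic is just sums. A C sketch of the idea (the struct mirrors the spec's field names; the helper names and exact size adjustments are mine: per the spec the pakt chunk begins with a 24-byte header and the data chunk with a 4-byte mEditCount field, and the second file's copy of each is dropped when concatenating):

```c
#include <stdint.h>

/* CAF pakt chunk header fields (big-endian on disk; host order here,
   i.e. after decoding). */
typedef struct {
    int64_t mNumberPackets;
    int64_t mNumberValidFrames;
    int32_t mPrimingFrames;
    int32_t mRemainderFrames;
} PacketTableHeader;

/* Merge two pakt headers for back-to-back concatenation: packet and
   frame counts add; priming/remainder stay zero for these recordings. */
static PacketTableHeader merge_pakt_headers(PacketTableHeader a,
                                            PacketTableHeader b) {
    PacketTableHeader out;
    out.mNumberPackets     = a.mNumberPackets + b.mNumberPackets;
    out.mNumberValidFrames = a.mNumberValidFrames + b.mNumberValidFrames;
    out.mPrimingFrames     = 0;
    out.mRemainderFrames   = 0;
    return out;
}

/* The merged pakt chunk keeps one 24-byte header plus both packet
   tables; the merged data chunk keeps one 4-byte mEditCount field
   plus both audio payloads. */
static int64_t merged_pakt_chunk_size(int64_t sizeA, int64_t sizeB) {
    return sizeA + sizeB - 24;   /* drop the second header */
}
static int64_t merged_data_chunk_size(int64_t sizeA, int64_t sizeB) {
    return sizeA + sizeB - 4;    /* drop the second mEditCount */
}
```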

Be careful when reading data from all the binary numeric fields within these files: the files are big-endian but the iPhone is little-endian.

For extra credit, you might also like to consider deleting segments of audio from within a file, or inserting one audio file into the middle of another. This is a little trickier as you have to parse the contents of the pakt chunk. Again it's a case of following the spec: there's a good description of how the packet sizes are stored in variable-length integers, so you'll have to parse these to find how many bytes each packet takes up in the data chunk, and calculate their positions accordingly.
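The packet table's variable-length integers work like this: each byte contributes seven bits, most-significant group first, and a set high bit means another byte follows. A C sketch of decoding them and locating a packet's byte offset in the data chunk (function names are mine):

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one variable-length integer from a CAF packet table.
   Returns the number of bytes consumed, or 0 if the table was
   truncated mid-value. */
static size_t decode_vlq(const uint8_t *p, size_t len, uint64_t *out) {
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++) {
        v = (v << 7) | (p[i] & 0x7F);
        if ((p[i] & 0x80) == 0) { *out = v; return i + 1; }
    }
    return 0;
}

/* Sum the first n packet sizes to find the nth packet's byte offset
   within the data chunk's payload (after its mEditCount field). */
static uint64_t packet_offset(const uint8_t *table, size_t len, size_t n) {
    uint64_t off = 0, size;
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        size_t used = decode_vlq(table + pos, len - pos, &size);
        if (used == 0) break;   /* truncated table */
        off += size;
        pos += used;
    }
    return off;
}
```

For example, a table of 0x01, 0x81 0x00, 0x7F encodes packet sizes of 1, 128 and 127 bytes, so the third packet starts 129 bytes into the data payload.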

All in all this is rather more hassle than I was hoping for. Maybe there's an open source library that will do all this for you, but I couldn't find one.

However, handling raw files like this is blindingly fast compared to using AVMutableComposition and AVMutableCompositionTrack as in the original question - inserting an hour-long recording into another of the same length takes about two seconds.

Good luck!


I found a way that was much faster to implement:

  1. Use AVAudioRecorder with the extension "m4a" for a temporary file. You can also use "caf" if you want, but it's unnecessary.

  2. Join your newly recorded temporary m4a and the existing m4a with an AVAssetExportSession, using AVAssetExportPresetPassthrough, exportSession.outputFileType = AVFileTypeQuickTimeMovie and a filename "audioJoined.mov". This gives you an instant join (no recompression) and produces a "mov".

Note: unfortunately AVAudioPlayer cannot play a "mov", so the next step converts it to something playable. However, if you are just going to share the file somewhere, you could potentially skip the next step, since the mov is perfectly playable on a Mac in QuickTime. It can also be played in iTunes, synced back to an iPhone and played in the iPod app.

  3. Convert the mov back to an m4a using [[AVAssetExportSession alloc] initWithAsset:movFileAsset presetName:AVAssetExportPresetAppleM4A], @"audioJoined.m4a" for the filename and exportSession.outputFileType = AVFileTypeAppleM4A. Again, this is instant. I'm guessing that the exporter is smarter in this situation when it starts with a mov asset rather than an AVMutableComposition asset.

I'm using this technique in an app that is able to resume recording after recording has been stopped and the file has been played, or even if the app is restarted, pretty cool.
