AVAudioRecorder doesn't write out proper WAV File Header

2023-03-11 04:13

I'm working on a project on the iPhone where I'm recording audio from the device mic using AVAudioRecorder, and then will be manipulating the recording.

To ensure that I'm reading in the samples from the file correctly, I'm using Python's wave module to see if it returns the same samples.

However, Python's wave module returns "fmt chunk and/or data chunk missing" when trying to open the WAV file that is saved by AVAudioRecorder.

These are the settings I am using to record the file:

[audioSettings setObject:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
[audioSettings setObject:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioSettings setObject:[NSNumber numberWithFloat:4096] forKey:AVSampleRateKey];
[audioSettings setObject:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];
[audioSettings setObject:[NSNumber numberWithBool:YES] forKey:AVLinearPCMIsNonInterleaved];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];

After that, I'm just making a call to recordForDuration to actually do the recording.

The recording succeeds-- I can play the file etc, and I can read in the samples using AudioFile services, but I can't validate it because I can't open the file with Python's wave module.

This is what the first 128 bytes of the file look like:

1215N:~/Downloads$ od -c --read-bytes 128 testFile.wav
0000000   R   I   F   F   x   H 001  \0   W   A   V   E   f   m   t    
0000020 020  \0  \0  \0 001  \0 001  \0   @ 037  \0  \0 200   >  \0  \0
0000040 002  \0 020  \0   F   L   L   R 314 017  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000200

Any idea what I need to do to make sure a correct WAV header is written out by AVAudioRecorder?

Apple software often creates WAVE files with a non-standard (but "spec" conformant) "FLLR" subchunk after the "fmt " subchunk and before the "data" subchunk. I assume "FLLR" stands for "filler", and I assume the purpose of the subchunk is to enable some sort of data alignment optimization. The subchunk is usually about 4000 bytes long, but its actual length can vary depending on the length of the data preceding it.

Adding arbitrary subchunks to WAVE files is generally considered spec-conformant because WAVE is built on the RIFF container format, and the common practice in RIFF file processing is to ignore chunks and subchunks that have an unrecognized identifier. The identifier "FLLR" is "non-standard" and so should be ignored by any software that does not recognize it.

There is a fair amount of software out there that treats the WAVE format much more rigidly than it ought to, and I suspect the library you're using may be one of those pieces of software. For example, I have seen software that assumes that the audio bytes always begin at offset 44 -- this is an incorrect assumption.

In fact, the correct way to find the audio bytes in a WAVE file is to locate the "data" subchunk within the RIFF and use its position and size.

Reading WAVE files properly must really begin as an exercise in locating and identifying RIFF subchunks. RIFF subchunks have an 8-byte header: 4 bytes for an identifier/name field which is traditionally filled with human-readable ASCII characters (e.g. "fmt "), and a 4-byte little-endian unsigned integer specifying the number of bytes in the subchunk's data payload -- the subchunk's data payload follows immediately after its 8-byte header.
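
As a minimal sketch of that 8-byte header layout, the Python below decodes the "FLLR" header that sits at offset 36 in the dump from the question. The helper name parse_chunk_header is made up for illustration, and the byte values are transcribed by hand from the od output.

```python
import struct

def parse_chunk_header(header: bytes):
    """Split an 8-byte RIFF subchunk header into (identifier, payload_size)."""
    # '<4sI' = 4 raw identifier bytes, then a little-endian unsigned 32-bit size.
    ident, size = struct.unpack("<4sI", header)
    return ident, size

# Bytes 36..43 of the dump: 'F' 'L' 'L' 'R' 314 017 \0 \0 (od prints octal).
fllr_header = b"FLLR" + bytes([0o314, 0o017, 0, 0])
ident, size = parse_chunk_header(fllr_header)

# The payload follows the header immediately; the next subchunk header
# follows the payload.
fllr_header_offset = 36
next_header_offset = fllr_header_offset + 8 + size
print(ident, size, next_header_offset)
```

With these bytes the size decodes to 4044, which would put the next subchunk header at offset 4088 and its payload at offset 4096. That is at least consistent with the guess that the filler exists for alignment.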

The WAVE file format reserves certain subchunk identifiers (or "names") as being meaningful to the WAVE format. At least two subchunks must appear in every WAVE file:

"fmt " - the subchunk with this identifier has a payload which describes the basic information about the audio's format: sample rate, bit depth, etc.
"data" - the subchunk with this identifier has the actual audio bytes in its payload

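To make the "fmt " payload concrete, here is a small sketch that decodes the 16 payload bytes shown in the dump from the question (file offsets 20 to 35). It assumes the basic PCM layout of the format block; the variable names are my own.

```python
import struct

# The 16-byte "fmt " payload from the od output (od prints octal; '@' is 0x40
# and '>' is 0x3E).
fmt_payload = bytes([
    0o001, 0, 0o001, 0,   # format tag, channel count
    0x40, 0o037, 0, 0,    # sample rate
    0o200, 0x3E, 0, 0,    # average bytes per second
    0o002, 0, 0o020, 0,   # block align, bits per sample
])

# Basic PCM format block: two 16-bit fields, two 32-bit fields, two 16-bit
# fields, all little-endian.
(format_tag, channels, sample_rate,
 byte_rate, block_align, bits_per_sample) = struct.unpack("<HHIIHH", fmt_payload)

print(format_tag, channels, sample_rate, byte_rate, block_align, bits_per_sample)
```

These bytes decode to format tag 1 (PCM), 1 channel, 8000 Hz, 16000 bytes per second, block align 2, and 16 bits per sample. Note that the header reports 8000 Hz rather than the 4096 passed for AVSampleRateKey in the question.
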
"fact" is the next most common subchunk identifier. It is usually found in WAVE files that use a compressed codec, such as μ-law. See this enthusiast webpage for more information about some of the various subchunk identifiers in use today in the wild, and information about their payload structure.

From a purely RIFF perspective, subchunks need not appear in any particular order in the file, or at any particular fixed offset. In practice, however, almost all software expects the "fmt " subchunk to be the first subchunk. This is a concession to practicality: it is convenient to know early in the data stream what format of audio the WAVE contains, which makes it easier to play a WAVE file from a network stream, for example. If the WAVE file uses a compressed format, such as μ-law, it is usually assumed that the "fact" subchunk will appear directly after "fmt ".

After the format-specifying chunks are out of the way, assumptions about the location, ordering, and naming of subchunks should be abandoned. From this point on, the software should locate expected subchunks by name only (e.g. "data"). Subchunks with unrecognized names (e.g. "FLLR") should simply be skipped over and ignored. Skipping a subchunk requires reading its length field so that the correct number of bytes can be skipped; a payload with an odd length is followed by one pad byte, which must be skipped as well.
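
The skip-by-length approach can be sketched roughly as follows. The names iter_chunks and make_chunk are hypothetical helpers, and the test file is synthetic: it is built in memory with a 4044-byte "FLLR" filler to mimic the layout in the question, not read from a real AVAudioRecorder file.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (identifier, payload_offset, payload_size) for each subchunk."""
    preamble = stream.read(12)
    if len(preamble) < 12:
        raise ValueError("too short to be a RIFF/WAVE stream")
    riff, _riff_size, form = struct.unpack("<4sI4s", preamble)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # no more complete subchunk headers
        ident, size = struct.unpack("<4sI", header)
        payload_offset = stream.tell()
        yield ident, payload_offset, size
        # Skip the payload, plus one pad byte when the size is odd.
        stream.seek(payload_offset + size + (size & 1))

def make_chunk(ident, payload):
    """Build one subchunk: identifier, little-endian size, payload, pad byte."""
    pad = b"\0" if len(payload) % 2 else b""
    return ident + struct.pack("<I", len(payload)) + payload + pad

# Synthetic file: "fmt ", a 4044-byte filler, then four 16-bit samples.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
audio = struct.pack("<4h", 1, -2, 3, -4)
body = (b"WAVE" + make_chunk(b"fmt ", fmt)
        + make_chunk(b"FLLR", bytes(4044)) + make_chunk(b"data", audio))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

found = {ident: (offset, size)
         for ident, offset, size in iter_chunks(io.BytesIO(wav))}
print(found[b"data"])
```

For this synthetic file the "data" payload is reported at offset 4096 with a size of 8 bytes. The walker never needs to know what "FLLR" means; it only reads the length and moves on.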

What Apple has done with the "FLLR" subchunk is slightly unusual, and I'm not surprised that some software is tripped up by it. I suspect that the library you are using is simply unprepared to deal with the presence of the "FLLR" subchunk. I would consider this a defect in the library. The mistake the library authors have made is probably something like:

They may be expecting the "data" subchunk to appear within the first N bytes of the beginning of the file, where N is something less than ~4kB. They may give up looking if they have to scan too far into the file. The Apple "FLLR" subchunk pushes the "data" subchunk to a position >~4kB into the file.
They may be expecting the "data" subchunk to have a specific ordinal subchunk position or byte offset within the RIFF. Perhaps they expect "data" to appear immediately after "fmt ". This is an incorrect way to process a RIFF file, though. The ordinal position and/or offset position of the "data" subchunk should not be assumed.

As long as we're talking about correct WAVE file processing, I might as well remind everyone that the audio bytes (the data subchunk's payload) may not run exactly to the end of the file. It is allowable to insert subchunks after the data payload. Some programs use this to store a textual "comment" field at the end of the file. If you read blindly from the start of the data payload until EOF, you may pull in some metadata subchunks as audio, which will sound like a "click" at the end of playback. You need to honor the length field of the data subchunk and stop reading audio once you've consumed the entire data payload, rather than stopping when you hit EOF.
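
To illustrate the difference, here is a rough sketch that compares reading to EOF with honoring the declared length. Everything in it is made up for the demonstration: the helper read_audio_bytes, the trailing "cmnt" identifier, and the sample values.

```python
import struct

def read_audio_bytes(raw: bytes) -> bytes:
    """Return exactly the payload of the first "data" subchunk."""
    pos = 12  # skip "RIFF", the RIFF size field, and "WAVE"
    while pos + 8 <= len(raw):
        ident, size = struct.unpack_from("<4sI", raw, pos)
        start = pos + 8
        if ident == b"data":
            # Stop at the declared length, not at the end of the file.
            return raw[start:start + size]
        pos = start + size + (size & 1)  # skip payload and any pad byte
    raise ValueError("no data subchunk found")

# Synthetic file: four 16-bit samples followed by a trailing metadata subchunk
# with a hypothetical "cmnt" identifier.
samples = struct.pack("<4h", 100, -100, 200, -200)
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
trailer = b"a trailing comment"
body = (b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"data" + struct.pack("<I", len(samples)) + samples
        + b"cmnt" + struct.pack("<I", len(trailer)) + trailer)
raw = b"RIFF" + struct.pack("<I", len(body)) + body

correct = read_audio_bytes(raw)
# Naive approach: everything after the "data" header up to EOF.
naive = raw[raw.index(b"data") + 8:]
print(len(correct), len(naive))
```

Here the length-aware read returns the 8 bytes of audio, while the read-to-EOF version returns 34 bytes, the extra 26 being the trailing subchunk's header and text misread as samples.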

What's the name of the file you're recording to on disk? I had a similar problem and just solved it by tacking on .wav to the end of my filename... I guess AVAudioRecorder needs an extension to figure things out.
