Audio Processing: Playing with volume level

2023-01-21 04:32 Q&A author:

I want to read a sound file from the application bundle, copy it, play with its maximum volume level (gain value or peak power, I'm not sure about the technical name for it), and then write it as another file to the bundle again.

I did the copying and writing part. The resulting file is identical to the input file. I use the AudioFileReadBytes() and AudioFileWriteBytes() functions of AudioFile services in the AudioToolbox framework to do that.

So, I have the input file's bytes and also its audio data format (via AudioFileGetProperty() with kAudioFilePropertyDataFormat), but I can't find a variable in these to play with the original file's maximum volume level.

To clarify my purpose, I'm trying to produce another sound file whose volume level is increased or decreased relative to the original one, so I don't care about the system's volume level, which is set by the user or iOS.

Is that possible to do with the framework I mentioned? If not, are there any alternative suggestions?

Thanks

edit: Walking through Sam's answer regarding some audio basics, I decided to expand the question with another alternative.

Can I use AudioQueue services to record an existing sound file (which is in the bundle) to another file and play with the volume level (with the help of the framework) during the recording phase?

update: Here's how I'm reading the input file and writing the output. The code below lowers the sound level for "some" of the amplitude values, but with lots of noise. Interestingly, if I choose 0.5 as the amplitude value it increases the sound level instead of lowering it, but when I use 0.1 it lowers the sound. Both cases involve disturbing noise. I think that's why Art is talking about normalization, but I have no idea about normalization.

AudioFileID inFileID;

CFURLRef inURL = [self inSoundURL];

AudioFileOpenURL(inURL, kAudioFileReadPermission, kAudioFileWAVEType, &inFileID);

UInt32 fileSize = [self audioFileSize:inFileID];
Float32 *inData = malloc(fileSize * sizeof(Float32)); //I used Float32 type with jv42's suggestion
AudioFileReadBytes(inFileID, false, 0, &fileSize, inData);

Float32 *outData = malloc(fileSize * sizeof(Float32));

//Art's suggestion, if I've correctly understood him

float ampScale = 0.5f; //this will reduce the 'volume' by about 6 dB
for (int i = 0; i < fileSize; i++) {
    outData[i] = (Float32)(inData[i] * ampScale);
}

AudioStreamBasicDescription outDataFormat = {0};
[self audioDataFormat:inFileID];

AudioFileID outFileID;

CFURLRef outURL = [self outSoundURL];
AudioFileCreateWithURL(outURL, kAudioFileWAVEType, &outDataFormat, kAudioFileFlags_EraseFile, &outFileID);

AudioFileWriteBytes(outFileID, false, 0, &fileSize, outData);

AudioFileClose(outFileID);
AudioFileClose(inFileID);

You won't find amplitude scaling operations in (Ext)AudioFile, because it's about the simplest DSP you can do.

Let's assume you use ExtAudioFile to convert whatever you read into 32-bit floats. To change the amplitude, you simply multiply:

float ampScale = 0.5f; //this will reduce the 'volume' by about 6 dB
for (int ii=0; ii<numSamples; ++ii) {
    *sampOut = *sampIn * ampScale;
    sampOut++; sampIn++;
}

To increase the gain, use a scale > 1.f. For example, an ampScale of 2.f would give you about +6 dB of gain.

If you want to normalize, you have to make two passes over the audio: one to find the sample with the greatest absolute amplitude, then another to apply the gain you computed from it.
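The two passes could look roughly like this for a buffer of 32-bit float samples. This is only a sketch: the names find_peak, normalize_peak and target_peak are made up for illustration and are not part of any Apple API, and it assumes the whole buffer is already in memory.

```c
#include <stddef.h>

/* Pass 1: find the largest absolute sample value in the buffer. */
static float find_peak(const float *samples, size_t num_samples) {
    float peak = 0.0f;
    for (size_t i = 0; i < num_samples; ++i) {
        float magnitude = samples[i] < 0.0f ? -samples[i] : samples[i];
        if (magnitude > peak) {
            peak = magnitude;
        }
    }
    return peak;
}

/* Pass 2: scale every sample so the peak lands on target_peak.
 * Returns the gain that was applied (1.0f if the buffer is silent). */
static float normalize_peak(float *samples, size_t num_samples, float target_peak) {
    float peak = find_peak(samples, num_samples);
    if (peak == 0.0f) {
        return 1.0f; /* all zeros: nothing to scale, avoid dividing by zero */
    }
    float gain = target_peak / peak;
    for (size_t i = 0; i < num_samples; ++i) {
        samples[i] *= gain;
    }
    return gain;
}
```

With float samples in the usual -1.0 to 1.0 range, a target_peak of 1.0f would bring the loudest sample to full scale; a slightly lower target leaves some headroom.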

Using AudioQueue services just to get access to the volume property is serious, serious overkill.

UPDATE:

In your updated code, you're multiplying each byte by 0.5 instead of each sample. Here's a quick-and-dirty fix for your code, but see my notes below. I wouldn't do what you're doing.

...

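// create short pointers to our byte data
int16_t *inDataShort = (int16_t *)inData;
int16_t *outDataShort = (int16_t *)outData;

// fileSize is in bytes; each 16-bit sample occupies two of them
UInt32 numSamples = fileSize / sizeof(int16_t);

int16_t ampScale = 2;
for (UInt32 i = 0; i < numSamples; i++) {
    outDataShort[i] = inDataShort[i] / ampScale;
}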

...

Of course, this isn't the best way to do things: it assumes your file is little-endian 16-bit signed linear PCM. (Most WAV files are, but not AIFF, m4a, mp3, etc.) I'd use the ExtAudioFile API instead of the AudioFile API, as it will convert any format you're reading into whatever format you want to work with in code. Usually the simplest thing to do is read your samples in as 32-bit float. Here's an example of your code using the ExtAudioFile API to handle any input file format, including stereo vs. mono:

void ScaleAudioFileAmplitude(NSURL *theURL, float ampScale) {
    OSStatus err = noErr;

    ExtAudioFileRef audiofile = NULL;
    // under ARC this cast needs to be (__bridge CFURLRef)theURL
    err = ExtAudioFileOpenURL((CFURLRef)theURL, &audiofile);
    assert(err == noErr && audiofile);

    // get some info about the file's format.
    AudioStreamBasicDescription fileFormat;
    UInt32 size = sizeof(fileFormat);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileDataFormat, &size, &fileFormat);

    // we'll need to know what type of file it is later when we write 
    AudioFileID aFile;
    size = sizeof(aFile);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_AudioFile, &size, &aFile);
    AudioFileTypeID fileType;
    size = sizeof(fileType);
    err = AudioFileGetProperty(aFile, kAudioFilePropertyFileFormat, &size, &fileType);


    // tell the ExtAudioFile API what format we want samples back in
    AudioStreamBasicDescription clientFormat;
    bzero(&clientFormat, sizeof(clientFormat));
    clientFormat.mChannelsPerFrame = fileFormat.mChannelsPerFrame;
    clientFormat.mBytesPerFrame = 4;
    clientFormat.mBytesPerPacket = clientFormat.mBytesPerFrame;
    clientFormat.mFramesPerPacket = 1;
    clientFormat.mBitsPerChannel = 32;
    clientFormat.mFormatID = kAudioFormatLinearPCM;
    clientFormat.mSampleRate = fileFormat.mSampleRate;
    clientFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
    err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);

    // find out how many frames we need to read
    SInt64 numFrames = 0;
    size = sizeof(numFrames);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileLengthFrames, &size, &numFrames);

    // create the buffers for reading in data
    AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (clientFormat.mChannelsPerFrame - 1));
    bufferList->mNumberBuffers = clientFormat.mChannelsPerFrame;
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        bufferList->mBuffers[ii].mDataByteSize = sizeof(float) * numFrames;
        bufferList->mBuffers[ii].mNumberChannels = 1;
        bufferList->mBuffers[ii].mData = malloc(bufferList->mBuffers[ii].mDataByteSize);
    }

    // read in the data
    UInt32 rFrames = (UInt32)numFrames;
    err = ExtAudioFileRead(audiofile, &rFrames, bufferList);

    // close the file
    err = ExtAudioFileDispose(audiofile);

    // process the audio
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        float *fBuf = (float *)bufferList->mBuffers[ii].mData;
        for (int jj=0; jj < rFrames; ++jj) {
            *fBuf = *fBuf * ampScale;
            fBuf++;
        }
    }

    // open the file for writing
    err = ExtAudioFileCreateWithURL((CFURLRef)theURL, fileType, &fileFormat, NULL, kAudioFileFlags_EraseFile, &audiofile);

    // tell the ExtAudioFile API what format we'll be sending samples in
    err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);

    // write the data
    err = ExtAudioFileWrite(audiofile, rFrames, bufferList);

    // close the file
    ExtAudioFileDispose(audiofile);

    // destroy the buffers
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        free(bufferList->mBuffers[ii].mData);
    }
    free(bufferList);
    bufferList = NULL;

}

I think you should avoid working with 8-bit unsigned chars for audio, if you can. Try to get the data as 16-bit or 32-bit samples; that would avoid some noise/bad quality issues.
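If you are stuck with 8-bit data anyway, one option is to widen it before doing any math. The sketch below assumes unsigned 8-bit PCM where silence sits at 128 (which is how 8-bit WAV data is stored); the function name u8_to_float is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert unsigned 8-bit PCM (silence at 128) to floats in [-1.0, 1.0).
 * Removing the 128 offset first matters: scaling the raw bytes would
 * also scale the offset and shift the whole waveform. */
static void u8_to_float(const uint8_t *in, float *out, size_t num_samples) {
    for (size_t i = 0; i < num_samples; ++i) {
        out[i] = ((float)in[i] - 128.0f) / 128.0f;
    }
}
```

Once the samples are floats, scaling them is a plain multiply, and they can be converted back to the file's format when writing.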

For most common audio file formats there isn't a single master volume variable. Instead you will need to take (or convert to) the PCM sound samples and perform at least some minimal digital signal processing (multiply, saturate/limit/AGC, quantization noise shaping, etc.) on each sample.
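As a rough illustration of the "multiply, saturate" part, here is a sketch for 16-bit signed PCM samples. The function name scale_int16_saturating is hypothetical, and the sketch does no limiting, AGC or noise shaping; it only clamps so that loud samples don't wrap around.

```c
#include <stddef.h>
#include <stdint.h>

/* Scale 16-bit signed PCM samples in place, clamping to the int16 range
 * so that loud samples saturate instead of wrapping around. */
static void scale_int16_saturating(int16_t *samples, size_t num_samples, float amp_scale) {
    for (size_t i = 0; i < num_samples; ++i) {
        float scaled = (float)samples[i] * amp_scale;
        if (scaled > 32767.0f) {
            scaled = 32767.0f;
        } else if (scaled < -32768.0f) {
            scaled = -32768.0f;
        }
        samples[i] = (int16_t)scaled; /* truncates toward zero */
    }
}
```

Hard clamping like this still distorts whatever gets clipped, so it is a safety net rather than a substitute for picking a sensible gain.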

If the sound file is already normalized, plain scaling can't make it louder without clipping. Except in the case of poorly encoded audio, volume is almost entirely the realm of the playback engine.

http://en.wikipedia.org/wiki/Audio_bit_depth

Properly stored audio files will have peak volume at or near the maximum value available for the file's bit depth. If you attempt to 'decrease the volume' of a sound file, you'll essentially just be degrading the sound quality.
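For integer samples, one way to see where the loss comes from is to attenuate a sample and then try to restore it. This is only a sketch with a made-up function name, assuming 16-bit samples and an integer divisor.

```c
#include <stdint.h>

/* Divide a 16-bit sample by divisor, store it as an integer, then
 * multiply it back. Whatever the division threw away does not return. */
static int16_t attenuate_then_restore(int16_t sample, int divisor) {
    int16_t quiet = (int16_t)(sample / divisor); /* low-order detail is dropped here */
    return (int16_t)(quiet * divisor);
}
```

For example, a sample of 1001 halved and doubled again comes back as 1000. The stronger the attenuation, the more low-order detail is dropped.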
