how to convert wav file to float amplitude

2023-04-13 05:37 问答作者：

so I asked everything in the title:

I have a 开发者_如何学Pythonwav file (written by PyAudio from an input audio) and I want to convert it in float data corresponding of the sound level (amplitude) to do some fourier transformation etc...

Anyone have an idea to convert WAV data to float?

I have identified two decent ways of doing this.

Method 1: using the wavefile module

Use this method if you don't mind installing some extra libraries which involved a bit of messing around on my Mac but which was easy on my Ubuntu server.

https://github.com/vokimon/python-wavefile

import wavefile

# returns the contents of the wav file as a double precision float array
def wav_to_floats(filename = 'file1.wav'):
    w = wavefile.load(filename)
    return w[1][0]

signal = wav_to_floats(sys.argv[1])
print "read "+str(len(signal))+" frames"
print  "in the range "+str(min(signal))+" to "+str(max(signal))

Method 2: using the wave module

Use this method if you want less module install hassles.

Reads a wav file from the filesystem and converts it into floats in the range -1 to 1. It works with 16 bit files and if they are > 1 channel, will interleave the samples in the same way they are found in the file. For other bit depths, change the 'h' in the argument to struct.unpack according to the table at the bottom of this page:

https://docs.python.org/2/library/struct.html

It will not work for 24 bit files as there is no data type that is 24 bit, so there is no way to tell struct.unpack what to do.

import wave
import struct
import sys

def wav_to_floats(wave_file):
    w = wave.open(wave_file)
    astr = w.readframes(w.getnframes())
    # convert binary chunks to short 
    a = struct.unpack("%ih" % (w.getnframes()* w.getnchannels()), astr)
    a = [float(val) / pow(2, 15) for val in a]
    return a

# read the wav file specified as first command line arg
signal = wav_to_floats(sys.argv[1])
print "read "+str(len(signal))+" frames"
print  "in the range "+str(min(signal))+" to "+str(max(signal))

I spent hours trying to find the answer to this. The solution turns out to be really simple: struct.unpack is what you're looking for. The final code will look something like this:

rawdata=stream.read()                  # The raw PCM data in need of conversion
from struct import unpack              # Import unpack -- this is what does the conversion
npts=len(rawdata)                      # Number of data points to be converted
formatstr='%ih' % npts                 # The format to convert the data; use '%iB' for unsigned PCM
int_data=unpack(formatstr,rawdata)     # Convert from raw PCM to integer tuple

Most of the credit goes to Interpreting WAV Data. The only trick is getting the format right for unpack: it has to be the right number of bytes and the right format (signed or unsigned).

Most wave files are in PCM 16-bit integer format.

What you will want to:

Parse the header to known which format it is (check the link from Xophmeister)
Read the data, take the integer values and convert them to float

Integer values range from -32768 to 32767, and you need to convert to values from -1.0 to 1.0 in floating points.

I don't have the code in python, however in C++, here is a code excerpt if the PCM data is 16-bit integer, and convert it to float (32-bit):

short* pBuffer = (short*)pReadBuffer;

const float ONEOVERSHORTMAX = 3.0517578125e-5f; // 1/32768 
unsigned int uFrameRead = dwRead / m_fmt.Format.nBlockAlign;

for ( unsigned int i = 0; i < uFrameCount * m_fmt.Format.nChannels; ++i )
{
    short i16In = pBuffer[i];
    out_pBuffer[i] = (float)i16In * ONEOVERSHORTMAX;
}

Be careful with stereo files, as the stereo PCM data in wave files is interleaved, meaning the data looks like LRLRLRLRLRLRLRLR (instead of LLLLLLLLRRRRRRRR). You may or may not need to de-interleave depending what you do with the data.

This version reads a wav file from the filesystem and converts it into floats in the range -1 to 1. It works with files of all sample widths and it will interleave the samples in the same way they are found in the file.

import wave

def read_wav_file(filename):
    def get_int(bytes_obj):
        an_int = int.from_bytes(bytes_obj, 'little',  signed=sampwidth!=1)
        return an_int - 128 * (sampwidth == 1)
    with wave.open(filename, 'rb') as file:
        sampwidth = file.getsampwidth()
        frames = file.readframes(-1)
    bytes_samples = (frames[i : i+sampwidth] for i in range(0, len(frames), sampwidth))
    return [get_int(b) / pow(2, sampwidth * 8 - 1) for b in bytes_samples]

Also here is a link to the function that converts floats back to ints and writes them to desired wav file:

https://gto76.github.io/python-cheatsheet/#writefloatsamplestowavfile

The Microsoft WAVE format is fairly well documented. See https://ccrma.stanford.edu/courses/422/projects/WaveFormat/ for example. It wouldn't take much to write a file parser to open and interpret the data to get the information you require... That said, it's almost certainly been done before, so I'm sure someone will give an "easier" answer ;)

继续阅读：audio pyaudio python

how to convert wav file to float amplitude

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？