Python find audio frequency and amplitude over time

Here is what I would like to do: find the audio frequency and amplitude of a .wav file at, say, every 1 ms of that file, and save the results to a file. I have graphed frequency vs. amplitude and amplitude over time, but I cannot figure out frequency over time. My end goal is to read that file and use the amplitude to adjust variables and the frequency to select which variables are being used; that seems to be the easy part. I have been using numpy, audiolab, matplotlib, etc. with FFTs, but I just cannot figure this one out. Any help is appreciated. Thank you!


Use a short-time Fourier transform (STFT) with overlapping windows to estimate the spectrogram. To save yourself the trouble of rolling your own, you can use the specgram function in Matplotlib's mlab module. Use a window short enough that the audio is approximately stationary within it, and make the buffer size a power of 2 so that a common radix-2 FFT can be used efficiently. 512 samples (about 10.67 ms at 48 ksps, or 93.75 Hz per bin) should suffice. At a sampling rate of 48 ksps, overlap by 464 samples to evaluate a sliding window every 1 ms (i.e. shift by 48 samples).
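The arithmetic behind those numbers can be checked with a few lines. This is only a sketch: the helper name stft_params and its hop_ms argument are made up for illustration and are not part of Matplotlib or NumPy.

```python
def stft_params(fs, nfft, hop_ms=1.0):
    """Derive STFT settings for a given sample rate and window length.

    Hypothetical helper for illustration only.
    """
    hop = int(round(fs * hop_ms / 1000.0))   # samples to shift per step
    return {
        "window_ms": 1000.0 * nfft / fs,     # duration of one window
        "bin_hz": fs / nfft,                 # spacing between FFT bins
        "hop": hop,
        "noverlap": nfft - hop,              # value to pass as noverlap
    }

params = stft_params(48000, 512)
print(params)
```

For a different sampling rate (44.1 ksps, for instance) the hop is rounded to a whole number of samples, so the step would only be approximately 1 ms.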

Edit:

Here's an example that uses mlab.specgram on an 8-second signal with one tone per second, stepping from 2 kHz up to 16 kHz. Note the response at the transients. I've zoomed in at 4 seconds to show the response in more detail. The frequency shifts at precisely 4 seconds, but it takes a full buffer length (512 samples; approx. +/- 5 ms around the transition) for the transient to pass. This illustrates the kind of spectral/temporal smearing caused by non-stationary transitions as they pass through the buffer. Additionally, even when the signal is stationary, there is spectral leakage caused by windowing the data. A Hamming window function was used to reduce the side lobes of the leakage, but this also widens the main lobe.

import numpy as np
from matplotlib import mlab, pyplot

#Python 2.x:
#from __future__ import division

Fs = 48000
N = 512
f = np.arange(1, 9) * 2000
t = np.arange(8 * Fs) / Fs 
x = np.empty(t.shape)
for i in range(8):
    x[i*Fs:(i+1)*Fs] = np.cos(2*np.pi * f[i] * t[i*Fs:(i+1)*Fs])

w = np.hamming(N)
ov = N - Fs // 1000 # e.g. 512 - 48000 // 1000 == 464
Pxx, freqs, bins = mlab.specgram(x, NFFT=N, Fs=Fs, window=w, 
                                 noverlap=ov)

#plot the spectrogram in dB

Pxx_dB = 10 * np.log10(Pxx)  # Pxx is a power spectral density, so dB = 10*log10
pyplot.subplots_adjust(hspace=0.4)

pyplot.subplot(211)
ex1 = bins[0], bins[-1], freqs[0], freqs[-1]
pyplot.imshow(np.flipud(Pxx_dB), extent=ex1)
pyplot.axis('auto')
pyplot.axis(ex1)
pyplot.xlabel('time (s)')
pyplot.ylabel('freq (Hz)')

#zoom in at t=4s to show transient

pyplot.subplot(212)
n1, n2 = int(3.991/8*len(bins)), int(4.009/8*len(bins))
ex2 = bins[n1], bins[n2], freqs[0], freqs[-1]
pyplot.imshow(np.flipud(Pxx_dB[:,n1:n2]), extent=ex2)
pyplot.axis('auto')
pyplot.axis(ex2)
pyplot.xlabel('time (s)')
pyplot.ylabel('freq (Hz)')

pyplot.show()
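To get from a spectrogram to the "frequency and amplitude every 1 ms, saved to a file" that the question asks for, one option is to pick the strongest bin in each frame. The following is a rough sketch using NumPy only, not the method used above: the function names track_peak and save_peaks, the CSV layout, and the amplitude scaling (2*|X|/sum(window)) are assumptions made for illustration. It also assumes a single dominant tone per frame; frequencies are quantized to the bin spacing (93.75 Hz here), and a tone that falls between bins will read somewhat low in amplitude.

```python
import os
import tempfile
import numpy as np

def track_peak(x, fs, nfft=512, hop=None):
    """Return (times, peak_freqs, peak_amps), one entry per hop.

    Hypothetical helper: reports only the strongest bin of each frame.
    """
    if hop is None:
        hop = fs // 1000                     # 1 ms step, 48 samples at 48 kHz
    w = np.hamming(nfft)
    n_frames = 1 + (len(x) - nfft) // hop
    # Index matrix: row k selects samples k*hop .. k*hop + nfft - 1
    idx = np.arange(nfft)[None, :] + hop * np.arange(n_frames)[:, None]
    spec = np.abs(np.fft.rfft(x[idx] * w, axis=1))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    k = spec.argmax(axis=1)                  # strongest bin in each frame
    # Assumed scaling: a sinusoid centred on a bin reads close to its amplitude
    amps = 2.0 * spec[np.arange(n_frames), k] / w.sum()
    times = (np.arange(n_frames) * hop + nfft / 2) / fs   # frame centres
    return times, freqs[k], amps

def save_peaks(path, times, peak_freqs, peak_amps):
    """Write one CSV row per frame: time, frequency, amplitude."""
    table = np.column_stack([times, peak_freqs, peak_amps])
    np.savetxt(path, table, delimiter=",",
               header="time_s,freq_hz,amplitude", comments="")

# Synthetic test signal: 2 kHz at amplitude 1.0, then 6 kHz at amplitude 0.5
fs = 48000
t = np.arange(fs // 5) / fs                  # 0.2 s
x = np.where(t < 0.1,
             1.0 * np.cos(2 * np.pi * 2000 * t),
             0.5 * np.cos(2 * np.pi * 6000 * t))

times, peak_freqs, peak_amps = track_peak(x, fs)
out_path = os.path.join(tempfile.gettempdir(), "peaks.csv")
save_peaks(out_path, times, peak_freqs, peak_amps)
print(times[0], peak_freqs[0], peak_amps[0])
```

The same peak picking should work on the arrays returned by mlab.specgram above, e.g. freqs[Pxx.argmax(axis=0)], since each column of Pxx is one time step. For a real recording, the samples could be loaded with something like scipy.io.wavfile.read (which returns the sample rate and the data) and, for stereo files, reduced to one channel before calling track_peak.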