Using Cepstrum for PDA
Hey, I am currently deleveloping a algorithm to decide wheather or not a frame is voiced or unvo开发者_运维技巧iced. I am trying to use the Cepstrum to discriminate between these two situations. I use MATLAB for my implementation.
I have some problems, saying something generally about the frame, but my currently implementation looks like (I'm award of the MATLAB has the function rceps, but this haven't worked for either):
ceps = abs(ifft(log10(abs(fft(frame.*window')).^2+eps)));
Can anybody give me a small demo, that will convert the frame to the power cepstrum, so a single lollipop at the pitch frequency. For instance use this code to generate the frequency.
fs = 8000;
timelength = 25e-3;
freq = 500;
k = 0:1/fs:timelength-(1/fs);
s = 0.8*sin(2*pi*freq*k);
Thanks.
According to Wikipedia, the power cepstrum is (deep breath) the magnitude squared of the Fourier transform of the log of the magnitude squared of the Fourier transform of the signal. So I think you're looking for
function c = ceps(frame, win)
c = abs(fft(log10(abs(fft(frame.*win)).^2+eps))).^2;
Note that I changed one of your variable names because WINDOW is a predefined function in the Signal Processing Toolbox.
But, ifft
and fft
only differ by a scale factor, and the outer abs
won't change the overall shape, so where's the lollipop right? See further down on the Wikipedia page.
A sinusoidal time input isn't going to give you an impulse in the cepstrum. The sine should yield an impulse in the spectrum, which will still be an impulse after the logmag operation, which will transform into a level shift in the cepstrum. To get something impulsive in the cepstrum, you need something periodic in the spectrum, which means you need something with multiple harmonic frequencies in the time domain. Consider, for instance, a square wave:
N = 1024;
h = hann(N, 'periodic');
f = 10;
x = sin(2*pi*f*((1:N)'-1)/N); %#'# to deal with SO formatting
s = 2*(x > 0) - 1; %# square wave
cx = ceps(x, h);
cs = ceps(s, h);
cs
will have your longed-for lollipop, not cx
.
There seems to always be a large component in the 0th cepstral bin. I guess this is because the logarithm operation always makes the input to the second FFT have a big level shift? Also, I don't get the idea of quefrency, I would have expected the lollipop to be at N/f
. So maybe there's still something wrong with this code, or (more likely) my understanding.
精彩评论