How to use the cepstrum?

Recently I asked this question: How to get the fundamental frequency from FFT? (you don't actually need to read it)

My question now is: how do I use the cepstrum algorithm?

I just don't know how to use it because the only language I know is ActionScript 3, so I have few references for the native functions found in C, Java and so on, or for how I should implement them in AS. Most articles are about those languages =/ (although answers in languages other than AS are welcome; just explain how the script works, please)

The articles I found about using the cepstrum to find the fundamental frequency from an FFT result told me that I should do this:

signal → FT → abs() → square → log → FT → abs() → square → power cepstrum

mathematically: |F{log(|F{f(t)}|²)}|²

Important info:

  • I am developing a GUITAR TUNER in flash
  • This is the first time I am dealing with advanced sound
  • I am using an FFT to extract frequency bins from the signal that reaches the user's microphone, but I got stuck on getting the fundamental frequency from it

I don't know:

  • How to square an ARRAY (I mean, the data that my FFT gives me is an array. Should I multiply it by itself? The ActionScript debugger throws errors when I try fftResults * fftResults)
  • How to apply the "log". I would not know how to apply it even if I had a single number.
  • What is the difference between the complex cepstrum and the power cepstrum? Also, which of them should I use? I am trying to develop a guitar tuner.

Thanks!


Note that the output of an FFT is an array of complex values, i.e. each bin = re + j*im. You cannot multiply the array by itself directly; instead, loop over the bins. I think you can just combine the abs and square operations and calculate re*re + im*im for each bin. This gives you a single positive value for each bin, and you can then take the log of each value individually (in ActionScript, Math.log(x) for each array element). You then need to do a second FFT on this log-squared data and, again, calculate re*re + im*im for each bin of the second FFT's output. You will then have an array of positive values with one or more peaks. The position of a peak corresponds to the period of a fundamental, so (assuming both FFTs have the same length) the fundamental frequency is the sample rate divided by the peak's index.
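As a rough sketch of the steps above (in Python rather than ActionScript, since the question allows other languages), the following computes a power cepstrum and picks its largest peak. The `fft` helper, the function names, the 70 to 1000 Hz search range, the small offset inside the log, and the synthetic 128 Hz test tone are all assumptions made for illustration; a real tuner would feed in microphone samples and would probably apply a window first.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT. len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def squared_magnitude(bins):
    # abs() and square combined: re*re + im*im for each complex bin
    return [c.real * c.real + c.imag * c.imag for c in bins]

def power_cepstrum(samples):
    power = squared_magnitude(fft(samples))
    # Natural log of each bin; the tiny offset (an assumption) avoids log(0)
    log_power = [math.log(p + 1e-12) for p in power]
    return squared_magnitude(fft(log_power))

def estimate_f0_cepstrum(samples, sample_rate, min_hz=70.0, max_hz=1000.0):
    cepstrum = power_cepstrum(samples)
    # Index q of the cepstrum corresponds to a period of q samples
    lo = int(sample_rate / max_hz)
    hi = min(int(sample_rate / min_hz), len(samples) // 2)
    best = max(range(lo, hi + 1), key=lambda q: cepstrum[q])
    return sample_rate / best

# Synthetic test tone (hypothetical): 128 Hz fundamental plus four weaker harmonics
SAMPLE_RATE = 8192
tone = [
    sum(math.sin(2 * math.pi * 128 * h * n / SAMPLE_RATE) / h for h in range(1, 6))
    for n in range(2048)
]
print(estimate_f0_cepstrum(tone, SAMPLE_RATE))
```

For this tone the strongest peak should sit at a period of 64 samples, which is 8192 / 64 = 128 Hz. The search range matters: the cepstrum also peaks at multiples of the period, so limiting the range to plausible guitar pitches helps avoid picking one of those instead.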


Autocorrelation is the easiest and most logical approach, and the best place to start.

To get this working, start with a simple autocorrelation, and then, if necessary, improve it following the outline provided by YIN. (YIN is based on the autocorrelation with refinements. But whether or not you'll need these refinements depends on the details of your situation.) This way, you can also learn as you go rather than trying to understand the whole thing in one shot.

Although FFT approaches can also work, they are a bit more confusing. The issue is that what you are really after is the period, and this isn't well represented by the FFT. The missing fundamental is a good example: if you have components at 2 Hz and 3 Hz, the fundamental is 1 Hz, which appears nowhere in the FFT but is obvious in a time-based representation (e.g. the autocorrelation). Add to this that overtones aren't necessarily harmonic, that there is noise, and so on, and it is usually best to start with a direct approach to the problem.
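To make the 2 Hz + 3 Hz example concrete, here is a minimal autocorrelation sketch in Python. The function names, the 100 Hz sample rate, the 4-second signal length, and the 0.5 to 5 Hz search range are assumptions chosen for illustration, and this is the plain (unnormalized) autocorrelation, not YIN.

```python
import math

def autocorrelation(samples, lag):
    """Sum of products of the signal with itself shifted by `lag` samples."""
    return sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))

def estimate_f0_autocorr(samples, sample_rate, min_hz, max_hz):
    # A period of `lag` samples corresponds to sample_rate / lag Hz
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(samples) - 1)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(samples, lag))
    return sample_rate / best_lag

# Neither component is at 1 Hz, but the waveform repeats once per second
SAMPLE_RATE_HZ = 100
signal = [
    math.sin(2 * math.pi * 2 * n / SAMPLE_RATE_HZ)
    + math.sin(2 * math.pi * 3 * n / SAMPLE_RATE_HZ)
    for n in range(400)
]
print(estimate_f0_autocorr(signal, SAMPLE_RATE_HZ, min_hz=0.5, max_hz=5.0))
```

The largest autocorrelation value in the search range should fall at a lag of 100 samples, i.e. a period of one second, even though the spectrum has no energy at 1 Hz. Skipping the smallest lags (via `max_hz`) is what keeps the trivial peak at lag 0 from winning.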


There are many ways of finding fundamental frequency (F0).

For languages like Java, there are many libraries with these types of algorithms already implemented (you can study their sources).

  • MFCC (based on the cepstrum), implemented in Comirva (open source)
  • Audacity (beta version!) (open source) offers cepstrum, autocorrelation, and enhanced autocorrelation
  • YIN, based on autocorrelation (example)
  • Finding max signal values after FFT

All of these algorithms may be very helpful for you. However, the easiest way to get F0 (one value in Hz) would be to use YIN.
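For orientation, here is a simplified Python sketch of what I understand to be the core YIN steps: a difference function, cumulative mean normalization, and an absolute threshold. It leaves out the interpolation and later refinements of the full algorithm, so treat it as an illustration rather than a reference implementation. The function name `yin_f0`, the 0.1 threshold, the 50 to 1000 Hz range, and the synthetic 100 Hz tone are assumptions.

```python
import math

def yin_f0(samples, sample_rate, min_hz=50.0, max_hz=1000.0, threshold=0.1):
    """Simplified YIN on a single frame, without parabolic interpolation."""
    min_tau = max(int(sample_rate / max_hz), 1)
    max_tau = int(sample_rate / min_hz)
    window = len(samples) - max_tau  # number of samples compared at every lag
    if window <= 0:
        raise ValueError("need more samples than the longest period searched")

    # Step 1: difference function d(tau) = sum of (x[j] - x[j + tau])^2
    diff = [0.0] * (max_tau + 1)
    for tau in range(1, max_tau + 1):
        diff[tau] = sum((samples[j] - samples[j + tau]) ** 2 for j in range(window))

    # Step 2: cumulative mean normalized difference (defined as 1 at tau = 0)
    cmnd = [1.0] * (max_tau + 1)
    running_sum = 0.0
    for tau in range(1, max_tau + 1):
        running_sum += diff[tau]
        cmnd[tau] = diff[tau] * tau / running_sum if running_sum > 0 else 1.0

    # Step 3: first lag under the threshold, then walk down to the local minimum
    for tau in range(min_tau, max_tau + 1):
        if cmnd[tau] < threshold:
            while tau + 1 <= max_tau and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau

    # Fallback: no lag passed the threshold, so use the global minimum
    best = min(range(min_tau, max_tau + 1), key=lambda t: cmnd[t])
    return sample_rate / best

# Synthetic tone (hypothetical): 100 Hz fundamental with two weaker harmonics
RATE = 8000
frame = [
    math.sin(2 * math.pi * 100 * n / RATE)
    + 0.5 * math.sin(2 * math.pi * 200 * n / RATE)
    + 0.25 * math.sin(2 * math.pi * 300 * n / RATE)
    for n in range(400)
]
print(yin_f0(frame, RATE))
```

Because the lag is an integer number of samples, the result is quantized to sample_rate / lag; for a tuner you would likely want the interpolation step from the full algorithm, or a higher sample rate, to resolve small pitch differences.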
