How do I implement the MFCC algorithm?
I want to implement the Mel-Frequency Cepstrum (MFCC) algorithm, but there are some things that I don't understand.
After the FFT is done we need to "Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows."
I know how to calculate the triangles and I also know how to convert to the mel scale. I simply don't know what to do with them.
If the triangles are defined, how do I map the power of the spectrum obtained above onto the mel scale?
Is it like this: Sum the frequencies inside the triangle and then convert the result to the mel scale? Or sum the frequencies inside the triangle according to a weight value (defined by the height of the triangle at that point) and then convert the result to the mel scale? Or convert all the frequencies inside the triangle to the mel scale according to the weight value? Or something else?
Can anyone clarify this for me?
I think this step of the process is a little weird and doesn't make complete sense (to me anyway). The centers of the filter bands are equally spaced along the mel scale, but the filters are triangles on the linear frequency scale, i.e. just like the figure here.
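To make the spacing concrete, here is a minimal sketch of how the triangle corner points could be placed. It assumes one common mel formula, 2595 * log10(1 + f / 700); other variants exist and the question doesn't say which one is in use. The helper names (`hz_to_mel`, `mel_to_hz`, `mel_band_edges`) and the 0 to 8000 Hz range are just made up for illustration.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Assumed mel formula (one common variant, not the only one).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of the formula above.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_band_edges(n_filters, f_min, f_max):
    # n_filters triangles need n_filters + 2 points: triangle i runs from
    # point i (left corner) through point i + 1 (peak) to point i + 2
    # (right corner), so neighbouring triangles overlap.
    mel_points = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    return mel_to_hz(mel_points)

# Hypothetical settings: 20 filters between 0 Hz and 8000 Hz.
edges_hz = mel_band_edges(n_filters=20, f_min=0.0, f_max=8000.0)
```

The points are evenly spaced in mel, so once they are converted back to Hz the triangles get wider as the frequency goes up.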
Then calculate the weighted sum using these triangles along the linear x-axis. (For this step, I think that some approaches normalize by the filter triangle's area and some don't. I'm honestly not sure about the final consequences here, though I suspect it may not mean much except to modify the final interpretation, since the results are all relative comparisons anyway. One maintains total energy, and the other gives equally weighted contributions per band.) Then take the log of this (which converts the overall volume factor to an offset).
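A tiny numeric sketch of the two points in that paragraph, under made-up numbers: the filter weights and the power values below are hypothetical, and "normalize by area" is taken here to mean dividing each weighted sum by the sum of that filter's weights, which is only one way to do it.

```python
import numpy as np

# Hypothetical bank: 2 filters over 6 spectrum bins (rows are filters).
bank = np.array([
    [0.0, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
])
# Hypothetical power spectrum values for the same 6 bins.
power = np.array([1.0, 4.0, 2.0, 3.0, 1.0, 0.5])

# Weighted sum per filter, without normalization.
energies = bank @ power

# Variant: divide each sum by the total weight of its filter.
area_normalized = energies / bank.sum(axis=1)

# Taking the log turns an overall volume factor into a constant offset:
# scaling the whole spectrum by 10 adds log(10) to every band.
log_energies = np.log(energies)
log_energies_louder = np.log(bank @ (10.0 * power))
offset = log_energies_louder - log_energies
```

Every entry of `offset` comes out as log(10), regardless of which band it is.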
Edit: To be more clear on applying the filters... Each triangle represents a separate filter, producing a separate weighted sum. If there are twenty filters in your filter bank, there will be twenty triangles, and twenty weighted sums to calculate. To apply each filter, for each x-axis value multiply the filter value at that x-location by the function value at that x-location, and add this to the sum for that particular filter. Most x-axis values will have two filters present there, so each x-location makes a contribution to two filters.
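The procedure above could be sketched roughly as follows. This is only an illustration: the corner frequencies in `edges_hz` are hand-picked rather than computed from the mel scale, the bin frequencies and the random power spectrum are made up, and the function names are hypothetical.

```python
import numpy as np

def triangular_filterbank(edges_hz, fft_freqs_hz):
    # Filter i rises from edges_hz[i] to a peak of 1 at edges_hz[i + 1],
    # then falls back to 0 at edges_hz[i + 2].
    n_filters = len(edges_hz) - 2
    bank = np.zeros((n_filters, len(fft_freqs_hz)))
    for i in range(n_filters):
        left, center, right = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (fft_freqs_hz - left) / (center - left)
        falling = (right - fft_freqs_hz) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, 1.0)
    return bank

def apply_filterbank(bank, power_spectrum):
    # One weighted sum per filter, written as explicit loops to mirror
    # the description: multiply filter value by spectrum value, add up.
    energies = np.zeros(bank.shape[0])
    for i in range(bank.shape[0]):
        for k in range(len(power_spectrum)):
            energies[i] += bank[i, k] * power_spectrum[k]
    return energies

# Hypothetical corner points (3 filters) and FFT bin frequencies in Hz.
edges_hz = np.array([0.0, 100.0, 250.0, 450.0, 700.0])
fft_freqs_hz = np.arange(0.0, 701.0, 50.0)
# Stand-in for a real power spectrum (|FFT|^2), one value per bin.
power_spectrum = np.random.default_rng(0).random(len(fft_freqs_hz))

bank = triangular_filterbank(edges_hz, fft_freqs_hz)
energies = apply_filterbank(bank, power_spectrum)
log_energies = np.log(energies)
```

The double loop gives the same result as the matrix product `bank @ power_spectrum`, which is what you would probably use in practice. Between the first and last peaks, the two filters covering a bin have weights that add up to 1.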