
Compare two audio files for beat/tempo and rating on iPhone

I want to develop an iPhone application that counts the number of phrases it receives when the user sings into the mic.

The application should also be able to determine whether the user's phrases are in, or out of, cadence with a preset beat. While the user sings into the mic, instrumental-only music plays.

So I have to merge the user's recorded voice with the instrumental music into one audio file. I already have the original song file. I have to compare the two and give the user a rating.

Note: the instrumental music is the original song file without the vocals.

Can anyone please help me? Thanks, Vadivelu


First you are going to need a solution for audio segmentation and onset detection. There are a few different ways to do this, some of which have been discussed on Stack Overflow already. Aubio is one library that may help you with this.
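
To make the idea concrete, here is a minimal sketch of a naive energy-based onset detector in Swift. It is only a placeholder for what a dedicated library such as Aubio does far more robustly; the frame size, hop size, and threshold are made-up tuning values, and `samples` is assumed to be mono Float PCM in [-1, 1].

    import Foundation

    // Naive onset detection: flag an onset whenever the short-time energy of a frame
    // jumps well above the energy of the previous frame.
    func detectOnsets(samples: [Float],
                      sampleRate: Double,
                      frameSize: Int = 1024,
                      hopSize: Int = 512,
                      threshold: Float = 1.5) -> [Double] {
        var onsetTimes: [Double] = []
        var previousEnergy: Float = 0
        var frameStart = 0
        while frameStart + frameSize <= samples.count {
            // Short-time energy of the current frame.
            let frame = samples[frameStart..<(frameStart + frameSize)]
            let energy = frame.reduce(0) { $0 + $1 * $1 } / Float(frameSize)
            // Sudden energy increase relative to the previous frame => candidate onset.
            if previousEnergy > 0, energy / previousEnergy > threshold {
                onsetTimes.append(Double(frameStart) / sampleRate)
            }
            previousEnergy = energy
            frameStart += hopSize
        }
        return onsetTimes
    }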

The second part, merging the two sound files, should be a simple summing operation between the sample buffers of the incoming microphone audio and the sample buffers of the original audio source.
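
A minimal sketch of that summing step, assuming both tracks are already decoded to mono Float PCM at the same sample rate (the clamping is just one simple way to keep the result within full scale):

    // Mix the recorded voice with the instrumental by summing sample buffers.
    func mix(voice: [Float], instrumental: [Float]) -> [Float] {
        let length = min(voice.count, instrumental.count)
        var mixed = [Float](repeating: 0, count: length)
        for i in 0..<length {
            // Simple sum; clamp (or scale) so the result stays within [-1, 1].
            mixed[i] = max(-1, min(1, voice[i] + instrumental[i]))
        }
        return mixed
    }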


Let me try to understand the application you are building.

  1. I have an iPhone and I play Lady Gaga :P.
  2. It plays the original song (instrumentals + vocals).
  3. As I start singing, the app must detect that I am trying to sing the song playing.
  4. If it does determine this, it switches to playing instrumentals only (karaoke style).
  5. Concurrently, it records my voice. At the end of the song, it does some analysis on how well I sang.

If this is correct, let me try to take a stab at Step #4. The basic idea is that only if I am singing something close to the song being played should it switch into karaoke mode.

I would pre-compute an energy envelope of the vocal-only portion of the song (the part the person is supposed to sing). To extract the vocal-only portion, you might have to pay a good singer to record it, because you probably cannot extract it from the original song.

To compute the energy envelope, I would use something like half wave rectification followed by a low pass filter (definitely something causal and fast).
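
A minimal sketch of that envelope follower in Swift: half-wave rectification followed by a causal one-pole low-pass filter. The smoothing coefficient is an assumed tuning value, not something prescribed above.

    // Energy envelope: rectify, then smooth with a one-pole low-pass filter.
    func energyEnvelope(of samples: [Float], smoothing: Float = 0.995) -> [Float] {
        var envelope = [Float](repeating: 0, count: samples.count)
        var state: Float = 0
        for (i, x) in samples.enumerated() {
            let rectified = max(0, x)                                // half-wave rectification
            state = smoothing * state + (1 - smoothing) * rectified  // causal one-pole low-pass
            envelope[i] = state
        }
        return envelope
    }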

Then, I would listen on the microphone and in real time compute the energy envelope of the input audio.
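
One way to do that on iOS is an AVAudioEngine tap on the input node. The sketch below assumes mono Float32 input and reuses the hypothetical `energyEnvelope(of:)` helper from the previous snippet; audio session configuration, microphone permission, and thread-safety around the rolling buffer are omitted for brevity.

    import AVFoundation

    // Keeps a rolling envelope of roughly the last 5 seconds of microphone input.
    final class LiveEnvelopeTracker {
        private let engine = AVAudioEngine()
        private(set) var rollingEnvelope: [Float] = []

        func start() throws {
            let input = engine.inputNode
            let format = input.outputFormat(forBus: 0)
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
                guard let self = self,
                      let channel = buffer.floatChannelData?[0] else { return }
                let samples = Array(UnsafeBufferPointer(start: channel,
                                                        count: Int(buffer.frameLength)))
                // Append this block's envelope, then trim to about the last 5 seconds.
                self.rollingEnvelope.append(contentsOf: energyEnvelope(of: samples))
                let maxCount = Int(format.sampleRate * 5)
                if self.rollingEnvelope.count > maxCount {
                    self.rollingEnvelope.removeFirst(self.rollingEnvelope.count - maxCount)
                }
            }
            engine.prepare()
            try engine.start()
        }

        func stop() {
            engine.inputNode.removeTap(onBus: 0)
            engine.stop()
        }
    }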

Knowing that I am 2:00 into "Telephone", I would compare the truth energy envelope from 1:55 to 2:00 to the energy envelope of the last 5 seconds I recorded. I would normalize each envelope some way. Depending on the overlap score, I would decide whether the person was attempting to sing "Telephone" or not.
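
The answer leaves the normalization and the overlap score open; one simple choice is to normalize each 5-second envelope to unit energy and use their cosine similarity as the score. The 0.6 threshold below is an assumed value that would need tuning against real recordings.

    // Compare the pre-computed "truth" envelope window against the live envelope window.
    func isProbablySinging(reference: [Float], live: [Float], threshold: Float = 0.6) -> Bool {
        let n = min(reference.count, live.count)
        guard n > 0 else { return false }
        var dot: Float = 0, refNorm: Float = 0, liveNorm: Float = 0
        for i in 0..<n {
            dot += reference[i] * live[i]
            refNorm += reference[i] * reference[i]
            liveNorm += live[i] * live[i]
        }
        guard refNorm > 0, liveNorm > 0 else { return false }
        // Cosine similarity of the two (implicitly normalized) envelopes.
        let score = dot / (refNorm.squareRoot() * liveNorm.squareRoot())
        return score > threshold
    }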

Best of luck!

Chuan
