Extract human sound from a wav file using java

2023-02-19 04:54

I am working on a project where I have to extract the human sound from an audio .wav file using Java.

The audio .wav file may contain 3 to 4 sounds, such as dog, cat, music and human. I have to identify the human sound and then extract that part from the audio .wav file.

I am using FFT.java and Complex.java.

Now I have written an AudioFileReader class which reads the audio.wav file from the hard drive and converts it to a byte array. I then used the above-mentioned FFT.java and Complex.java to apply FFT.fft(bytesArray), which gives me a Complex array in return.
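One thing worth checking before the FFT step: in a typical 16-bit PCM .wav file each sample spans two bytes, so the raw byte array is usually decoded into sample values first. The following is only a sketch under the assumption that the file is 16-bit signed little-endian mono PCM; the class name PcmDecoder and its methods are hypothetical and not part of the FFT.java or Complex.java mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper: assumes 16-bit signed little-endian mono PCM.
class PcmDecoder {
    /** Converts 16-bit signed little-endian PCM bytes to samples in [-1, 1). */
    static double[] toSamples(byte[] pcm) {
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte keeps its sign
            samples[i] = ((hi << 8) | lo) / 32768.0;
        }
        return samples;
    }

    /** Reads a WAV file and rejects any format other than the assumed one. */
    static double[] readWav(File file) throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file)) {
            AudioFormat fmt = in.getFormat();
            if (fmt.getSampleSizeInBits() != 16 || fmt.isBigEndian()
                    || fmt.getChannels() != 1
                    || !AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())) {
                throw new IllegalArgumentException("Unsupported format: " + fmt);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return toSamples(out.toByteArray());
        }
    }
}
```

The resulting double[] would then be what gets wrapped into Complex values for the transform, rather than the raw bytes.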

Now the problem is how to extract the human sound byte pattern from the returned Complex array... does anyone know how I might be able to achieve this?

Edit: We are assuming a very simple audio.wav file. For example, cat sound then silence, human sound then silence, dog sound then silence etc. No mixture of voices.
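Given that simplifying assumption (sounds separated by silence), a first step could be to cut the recording into segments wherever the signal goes quiet, and only then decide which segment is human. Below is a minimal sketch based on per-frame RMS energy; the class name SilenceSplitter, the frame size and the threshold are all assumptions that would need tuning on real recordings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a signal into non-silent [start, end) sample ranges.
class SilenceSplitter {
    /**
     * A frame counts as "loud" when its RMS exceeds the threshold.
     * Consecutive loud frames are merged into one segment.
     */
    static List<int[]> split(double[] samples, int frameSize, double threshold) {
        List<int[]> segments = new ArrayList<>();
        int segStart = -1; // -1 means we are currently inside silence
        for (int start = 0; start < samples.length; start += frameSize) {
            int end = Math.min(start + frameSize, samples.length);
            double sumSq = 0;
            for (int i = start; i < end; i++) {
                sumSq += samples[i] * samples[i];
            }
            boolean loud = Math.sqrt(sumSq / (end - start)) > threshold;
            if (loud && segStart < 0) {
                segStart = start;                          // a segment begins
            } else if (!loud && segStart >= 0) {
                segments.add(new int[] {segStart, start}); // a segment ends
                segStart = -1;
            }
        }
        if (segStart >= 0) {
            segments.add(new int[] {segStart, samples.length});
        }
        return segments;
    }
}
```

Each returned range can then be analysed on its own, which avoids having to separate overlapping sources.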

I think the standard way to handle problems like this is to convert the input signals into a Cepstrum or Mel-Cepstrum representation and then use the coefficients as the feature space for input into a classifier. There are many research papers that discuss solutions to these sorts of problems based on this basic approach, for example:

http://www.ics.forth.gr/netlab/data/J17.pdf
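To make the cepstrum idea a little more concrete, here is a rough sketch of the plain real cepstrum of one frame: the inverse DFT of the log magnitude spectrum. It is not the mel-cepstrum (which also applies a mel filter bank), and it uses a naive O(n²) DFT purely for readability; in practice an FFT would take its place. The class name Cepstrum is hypothetical.

```java
// Hypothetical helper: real cepstrum of a single frame via a naive DFT.
class Cepstrum {
    /** Returns the inverse DFT of log|DFT(frame)|. */
    static double[] realCepstrum(double[] frame) {
        int n = frame.length;
        double[] logMag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im += frame[t] * Math.sin(angle);
            }
            // the small floor avoids log(0) for empty frequency bins
            logMag[k] = Math.log(Math.hypot(re, im) + 1e-12);
        }
        // For real input the log magnitude is real and even,
        // so the inverse DFT reduces to a cosine sum.
        double[] cep = new double[n];
        for (int q = 0; q < n; q++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                sum += logMag[k] * Math.cos(2 * Math.PI * k * q / n);
            }
            cep[q] = sum / n;
        }
        return cep;
    }
}
```

Typically only the first dozen or so coefficients of each frame would be kept as the feature vector handed to the classifier.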

One possible shortcut you might try would be to put the input signals through a low bit-rate vocoder such as AMBE, then decode, and compare the quality of the original signal to the encoded/decoded signal. These vocoders are designed to highly compress human speech with fair to good quality at the expense of not being able to adequately represent non-speech sounds.
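The comparison step of that shortcut could be sketched as below. Because such vocoders generally do not preserve the waveform itself, this sketch compares magnitude spectra rather than raw samples, using a log-spectral distance; that choice of measure is an assumption, not something the vocoder prescribes. The encode/decode step is not shown, and the class name SpectralDistance is hypothetical.

```java
// Hypothetical helper: compares the magnitude spectrum of an original frame
// with that of the same frame after an encode/decode round trip.
class SpectralDistance {
    /** Log-spectral distance in dB between two magnitude spectra of equal length. */
    static double logSpectralDistanceDb(double[] magA, double[] magB) {
        if (magA.length != magB.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sumSq = 0;
        for (int k = 0; k < magA.length; k++) {
            // the small floor avoids division by zero and log(0)
            double diffDb = 20 * Math.log10((magA[k] + 1e-12) / (magB[k] + 1e-12));
            sumSq += diffDb * diffDb;
        }
        return Math.sqrt(sumSq / magA.length);
    }
}
```

Under this idea, segments whose distance stays small after the round trip would be the more speech-like ones; where to put the cut-off would have to be found experimentally.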

This can be achieved by AI (and little short of that). You might investigate APIs for speech recognition, but I doubt their ability to support signals with noise in the background.

E.G.

Is that a cat, or someone saying 'meow'?
Is that music, or someone singing 'do, re, mi..'?
Who said 'Polly wanna cracker', the human or the parrot?

Well, that's a classic AI problem (machine learning/pattern recognition). Have a look at the Wikipedia article.

But basically you'll need already classified data that you feed into your algorithm so that it can learn how to classify new data. But beware, 100% correctness is elusive for almost anything in this field, although for your simple problem it could be possible (it depends on your exact definition of the problem).
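As an illustration of learning from already classified data, here is about the simplest classifier one could try: a nearest-centroid classifier that stores the mean feature vector per label and assigns new data to the closest mean. It is only a sketch; the class name NearestCentroid, the labels and the feature vectors are made up, and a real system would likely need a stronger model and far more training data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: nearest-centroid classifier over fixed-length feature vectors.
class NearestCentroid {
    private final Map<String, double[]> centroids = new HashMap<>();

    /** Learns one mean feature vector per label from labelled examples. */
    void train(Map<String, double[][]> labelled) {
        for (Map.Entry<String, double[][]> e : labelled.entrySet()) {
            double[][] rows = e.getValue();
            double[] mean = new double[rows[0].length];
            for (double[] row : rows) {
                for (int i = 0; i < mean.length; i++) {
                    mean[i] += row[i] / rows.length;
                }
            }
            centroids.put(e.getKey(), mean);
        }
    }

    /** Returns the label whose centroid is closest in squared Euclidean distance. */
    String classify(double[] features) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double dist = 0;
            for (int i = 0; i < features.length; i++) {
                double diff = features[i] - e.getValue()[i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Training would call train with a map such as "human" to a handful of feature vectors and "dog" to another handful; classify then returns the label for an unseen vector.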
