Extract human sound from a WAV file using Java
I am working on a project where I have to extract the human sound from an audio .wav file using Java.
The audio .wav file may contain 3 to 4 sounds, such as dog, cat, music and human. I have to identify the human sound and then extract that part from the audio .wav file.
I am using FFT.java and Complex.java.
I have written an AudioFileReader class that reads the audio .wav file from the hard drive and converts it to a byte array. I then pass that array to FFT.fft(bytesArray), using the FFT.java and Complex.java mentioned above, which returns a Complex array.
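One thing worth checking before the FFT step: raw file bytes are not samples. A minimal sketch of the conversion, assuming 16-bit signed little-endian mono PCM with the WAV header already skipped (the class name PcmDecoder is made up for illustration; javax.sound.sampled's AudioSystem.getAudioInputStream can report the actual format and handles the header for you):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmDecoder {
    /**
     * Converts 16-bit signed little-endian mono PCM bytes into samples
     * scaled to the range [-1.0, 1.0). Assumes the header is already removed.
     */
    public static double[] toSamples(byte[] pcm) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[pcm.length / 2]; // two bytes per sample
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort() / 32768.0;
        }
        return samples;
    }
}
```

The resulting double values, rather than the individual bytes, are what would normally be wrapped into Complex objects and handed to the FFT.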
The problem now is how to extract the human sound byte pattern from the returned Complex array. Does anyone know how I might achieve this?
Edit: We are assuming a very simple audio.wav file. For example, cat sound then silence, human sound then silence, dog sound then silence etc. No mixture of voices.
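Under that assumption (each sound separated by silence), the file can be cut into candidate segments before any classification. A rough sketch using per-frame RMS energy; the class name, frame size and threshold are illustrative assumptions and would need tuning on real recordings:

```java
import java.util.ArrayList;
import java.util.List;

public class SilenceSplitter {
    /**
     * Returns {start, end} sample-index pairs (end exclusive), one per run
     * of consecutive frames whose RMS energy exceeds rmsThreshold.
     */
    public static List<int[]> split(double[] samples, int frameSize, double rmsThreshold) {
        List<int[]> segments = new ArrayList<>();
        int segStart = -1; // -1 means we are currently inside silence
        for (int start = 0; start < samples.length; start += frameSize) {
            int end = Math.min(start + frameSize, samples.length);
            double sumSq = 0;
            for (int i = start; i < end; i++) {
                sumSq += samples[i] * samples[i];
            }
            boolean loud = Math.sqrt(sumSq / (end - start)) > rmsThreshold;
            if (loud && segStart < 0) {
                segStart = start;                            // a sound begins
            } else if (!loud && segStart >= 0) {
                segments.add(new int[]{segStart, start});    // a sound ends
                segStart = -1;
            }
        }
        if (segStart >= 0) {
            segments.add(new int[]{segStart, samples.length}); // sound runs to end of file
        }
        return segments;
    }
}
```

Each returned segment can then be analysed and classified on its own, and the one labelled as human is the part to copy out of the original sample array.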
I think the standard way to handle problems like this is to convert the input signals into a cepstrum or mel-cepstrum representation and then use the coefficients as the feature space for input into a classifier. There are many research papers that discuss solutions to these sorts of problems based on this basic approach, for example:
http://www.ics.forth.gr/netlab/data/J17.pdf
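To make the cepstrum idea concrete, here is a small sketch of the real cepstrum of one frame, computed as the inverse DFT of the log magnitude spectrum. It uses a naive O(n²) DFT so that it does not depend on any particular FFT.java; the class name and the small constant added before the logarithm are assumptions for illustration. A mel-cepstrum would additionally apply a mel-scaled filter bank before the logarithm, which is not shown here.

```java
public class RealCepstrum {
    /** Real cepstrum of one frame: inverse DFT of log|DFT(frame)|. */
    public static double[] cepstrum(double[] frame) {
        int n = frame.length;
        double[] logMag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im += frame[t] * Math.sin(angle);
            }
            // Small offset avoids log(0) for empty frequency bins.
            logMag[k] = Math.log(Math.hypot(re, im) + 1e-12);
        }
        // The log magnitude of a real signal is real and symmetric,
        // so its inverse DFT reduces to a cosine sum.
        double[] c = new double[n];
        for (int q = 0; q < n; q++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                sum += logMag[k] * Math.cos(2 * Math.PI * k * q / n);
            }
            c[q] = sum / n;
        }
        return c;
    }
}
```

In practice one would compute this per short frame (for example 20 to 30 ms of audio) and keep only the first dozen or so coefficients as the feature vector for the classifier.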
One possible shortcut you might try would be to put the input signals through a low bit-rate vocoder such as AMBE, then decode, and compare the quality of the original signal to the encoded/decoded signal. These vocoders are designed to highly compress human speech with fair to good quality at the expense of not being able to adequately represent non-speech sounds.
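The encoding and decoding itself needs an external vocoder, which is not shown here. For the comparison step, one possible measure is the log-spectral distance between an original frame and the corresponding decoded frame. This is only a sketch: the class name is made up, the two frames are assumed to be time-aligned and of equal length (codec delay would have to be compensated first), and a naive DFT is used to keep it self-contained.

```java
public class LogSpectralDistance {
    private static final double EPS = 1e-12; // avoids division by zero and log(0)

    /** Power spectrum |X[k]|^2 of one frame via a naive DFT. */
    static double[] powerSpectrum(double[] frame) {
        int n = frame.length;
        double[] power = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im += frame[t] * Math.sin(angle);
            }
            power[k] = re * re + im * im;
        }
        return power;
    }

    /** Root-mean-square difference, in dB, between the two power spectra. */
    public static double distanceDb(double[] original, double[] decoded) {
        if (original.length != decoded.length) {
            throw new IllegalArgumentException("frames must have equal length");
        }
        double[] pa = powerSpectrum(original);
        double[] pb = powerSpectrum(decoded);
        double sumSq = 0;
        for (int k = 0; k < pa.length; k++) {
            double diffDb = 10 * Math.log10((pa[k] + EPS) / (pb[k] + EPS));
            sumSq += diffDb * diffDb;
        }
        return Math.sqrt(sumSq / pa.length);
    }
}
```

The expectation, following the reasoning above, is that speech segments survive the vocoder with a smaller distance than non-speech segments; where to put the decision threshold would have to be found experimentally.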
This can be achieved by AI (and little short of that). You might investigate APIs for speech recognition, but I doubt their ability to support signals with noise in the background.
E.g.
- Is that a cat, or someone saying 'meow'?
- Is that music, or someone singing 'do, re, mi..'?
- Who said 'Polly wanna cracker', the human or the parrot?
Well, that's a classic AI problem (machine learning/pattern recognition). Have a look at the Wikipedia article.
But basically you'll need already-classified data that you feed into your algorithm so that it can learn how to classify new data. But beware: 100% correctness is elusive for almost anything in this field, although for your simple problem it could be possible (depending on your exact definition of the problem).
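As an illustration of learning from already-classified data, here is a deliberately simple nearest-centroid classifier. It is a sketch, not a recommendation of this particular algorithm: the class name is made up, and the feature vectors are assumed to be fixed-length numeric descriptions of each sound (spectral or cepstral coefficients, for instance).

```java
import java.util.HashMap;
import java.util.Map;

public class NearestCentroid {
    private final Map<String, double[]> sums = new HashMap<>();
    private final Map<String, Integer> counts = new HashMap<>();

    /** Adds one labelled example, e.g. train("human", features). */
    public void train(String label, double[] features) {
        double[] sum = sums.computeIfAbsent(label, k -> new double[features.length]);
        for (int i = 0; i < features.length; i++) {
            sum[i] += features[i];
        }
        counts.merge(label, 1, Integer::sum);
    }

    /** Returns the label whose mean feature vector is closest, or null if untrained. */
    public String predict(double[] features) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            int n = counts.get(e.getKey());
            double dist = 0;
            for (int i = 0; i < features.length; i++) {
                double d = features[i] - e.getValue()[i] / n; // distance to class mean
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Usage would be to call train once per labelled recording ("human", "cat", "dog", ...) and then predict on the features of each unknown segment. Real systems tend to use stronger models, but the train-then-classify workflow is the same.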