Library for extracting words (speech) out from audio stream?
I have an audio stream and I would extract words (speech) from it. So for example having audio.wav I would get 001.wav, 002.wav, 开发者_StackOverflow社区003.wav, etc where each XXX.wav is one word.
I am looking for a library or program to do it -- platform does not matter, but I prefer open-source solution.
Thank you in advance for help.
Nuance, the company that makes Dragon Naturally Speaking, has a number of Software Development Kits.
The Audio Mining kit seems to match your requirements:
Dragon NaturallySpeaking SDK AudioMining is a speaker-independent speech recognition toolkit that enables the indexing of 100% of the speech information within audio files. The technology uses highly accurate speech recognition to turn audio files into XML text with timestamp information. This can be integrated with standard text-search products to enable rapid access to specific audio content.
The speech to speech+metadata is far and away the hardest part to get right. Once you have the speech + metadata, extracting the words as individual audio files is much more straightforward.
精彩评论