Speech recognition language model

2023-01-20 03:04

I would like to integrate speech recognition into my Android application.

I am aware that Google provides two language models (free form for dictation, and web search for short phrases).

However, my app will have a finite number of possible words (maybe a few thousand). Is it possible to specify the vocabulary, limiting recognition to these words, in the hope of achieving more accurate results?

My immediate thought is to use the web search language model and then check its results against my vocabulary.

Any thoughts appreciated.

I think your intuition is correct and you've answered your own question.

The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html

You can get back results using these recognizer models and then classify or filter the results to find what best matches your limited vocabulary. There are different techniques to do this and they can range from simple parsing to complex statistical models.
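As a rough illustration of the simplest end of that range, here is a minimal sketch in plain Java that keeps only the recognizer hypotheses made up entirely of in-vocabulary words. The class name `VocabularyFilter` and its methods are hypothetical, not part of any Android API. On Android, the candidate strings would presumably come from the list returned under `RecognizerIntent.EXTRA_RESULTS`; here they are just passed in as a `List<String>`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: filters recognizer hypotheses against a fixed vocabulary.
class VocabularyFilter {
    private final Set<String> vocabulary = new HashSet<>();

    VocabularyFilter(List<String> words) {
        for (String w : words) {
            vocabulary.add(normalize(w));
        }
    }

    // Case-insensitive comparison with surrounding whitespace removed.
    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the normalized hypotheses, in the recognizer's original order,
    // whose every whitespace-separated word is in the vocabulary.
    List<String> filter(List<String> hypotheses) {
        List<String> kept = new ArrayList<>();
        for (String h : hypotheses) {
            String norm = normalize(h);
            if (norm.isEmpty()) {
                continue;
            }
            boolean allKnown = true;
            for (String token : norm.split("\\s+")) {
                if (!vocabulary.contains(token)) {
                    allKnown = false;
                    break;
                }
            }
            if (allKnown) {
                kept.add(norm);
            }
        }
        return kept;
    }
}
```

Because the recognizer normally returns its hypotheses ordered by confidence, the first element of the filtered list is a reasonable pick. An exact-match filter like this rejects near misses, so a fuzzier comparison (edit distance or phonetic matching, for example) may work better for a vocabulary of a few thousand words.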

The only other alternative I've seen is to run a different speech recognizer on a server that can accept your own dedicated language model. This is costly and complex, though, and it is the approach used by commercial speech companies such as Vlingo, Dragon, or Microsoft's Bing.

You can use open-source models like VoxForge or inexpensive ones like LumenVox. Some have been ported to Android, though I forget by whom.

I answered pretty much the same question before - please check here: Building openears compatible language model

and here:

Typically you need very large text corpora to generate useful language models.

If you just have a small amount of training data, your language model will be over-fitted, which means that it will not generalize.
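To make the over-fitting point concrete, here is a toy sketch in Java, assuming a plain maximum-likelihood bigram model with no smoothing. The class `TinyBigramModel` is hypothetical and far simpler than what real language-model toolkits build. Trained on only a couple of sentences, it assigns probability zero to every word pair it has not seen.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy model: unsmoothed maximum-likelihood bigram estimates.
class TinyBigramModel {
    // How often each word appears as the first word of a pair.
    private final Map<String, Integer> contextCounts = new HashMap<>();
    // How often each "previous next" pair appears.
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    void train(String sentence) {
        String[] words = sentence.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            contextCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    // P(next | previous) = count(previous next) / count(previous as context).
    double probability(String previous, String next) {
        int context = contextCounts.getOrDefault(previous, 0);
        if (context == 0) {
            return 0.0; // context word never seen in training
        }
        int pair = bigramCounts.getOrDefault(previous + " " + next, 0);
        return pair / (double) context;
    }
}
```

After training on just "turn on the light" and "turn off the light", the model is certain that "the" is followed by "light" and gives a perfectly plausible continuation such as "the radio" a probability of zero. Larger corpora and smoothing are the usual remedies.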
