
Large-vocabulary speech recognition on iPhone without internet?

I used OpenEars, which needs a dictionary. It works when the word we say is in the dictionary, but I want to convert everything we speak. So I used Nuance's Dragon speech-to-text SDK, but it communicates with a web server, and I want to avoid server communication because of security concerns. Is it possible to convert speech to text for all the words we speak, as on Windows Mobile, without communicating with a server, i.e. entirely offline?


Speech recognition with an unlimited vocabulary requires very large computational and memory resources (gigabytes of memory), so it is very hard to do on an iPhone or other embedded device. The iPhone is roughly 9 times slower than a desktop; the iPad is easier since it has a more powerful CPU.

Google has put a great deal of effort into making its engine work offline for dictation, and it still prefers to send data to the server because that is significantly more accurate.

Because of that, most of the solutions running on small devices use a limited vocabulary, though that vocabulary can be large enough that you will not notice the limit. Usually 500-1000 words is enough to cover most practical situations. You can use OpenEars to recognize such a vocabulary.

To train a language model you need texts from your domain (words and expressions). Language model training is described in the CMUSphinx tutorial. To use the language model, you can make the following OpenEars API call:

- (void) changeLanguageModelToFile:(NSString *)languageModelPathAsString
                    withDictionary:(NSString *)dictionaryPathAsString

See the API reference for more details.
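For illustration, here is a minimal sketch of how that call might look in an app, assuming a PocketsphinxController that is already listening and a myDomain.languagemodel / myDomain.dic pair produced as described in the CMUSphinx tutorial and bundled with the app (the resource names and the pocketsphinxController property are placeholders, not part of OpenEars):

#import <OpenEars/PocketsphinxController.h>

// Sketch: switch an already-listening PocketsphinxController to a
// domain-specific language model and dictionary.
// "myDomain" and self.pocketsphinxController are placeholder names.
- (void) switchToDomainModel {
    NSString *lmPath  = [[NSBundle mainBundle] pathForResource:@"myDomain"
                                                        ofType:@"languagemodel"];
    NSString *dicPath = [[NSBundle mainBundle] pathForResource:@"myDomain"
                                                        ofType:@"dic"];
    [self.pocketsphinxController changeLanguageModelToFile:lmPath
                                            withDictionary:dicPath];
}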

You can use OpenEars with such a vocabulary and a corresponding language model to support free-form text entry on your device.


It could be done, but if you are looking for an unlimited-vocabulary speech-to-text converter, it is best if the computations are done on a server. The requirements for such a system are probably too great for a smartphone. The main areas where the requirements are huge are as follows:

  1. A dictionary to map input speech into text (see the example entries after this list).
  2. The computation needed to run the speech recognition algorithms.
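For example, the dictionary used by PocketSphinx/OpenEars maps every word you want to recognize to its phone sequence, one word per line. The entries below are illustrative, in CMU dictionary style:

HELLO    HH AH L OW
WORLD    W ER L D
WEATHER  W EH DH ER
FORECAST F AO R K AE S T

Each recognizable word needs such an entry, which is part of why an unlimited vocabulary is hard to support on-device.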

I believe this is the reason why companies like Google run their speech recognition services on a server and not on the phone.

But if the application only needs limited-vocabulary speech to text, then it might be worth giving it a try.

All the best!


Doesn't PocketSphinx work on iPhone without network connectivity? Aren't there some demo apps floating around, like VocalKit?

http://www.rajeevan.co.uk/pocketsphinx_in_iphone/ may be helpful.

