
How can I detect a user's input language using Ruby without using an online service?

I'm looking for a library or technique to detect the input language of blocks of text provided by users. Online lookups (like Google Translate) won't work for this task, as I'm writing an app which must run offline.

Thanks.


Here are two more n-gram-based gems you might want to try. They work offline.

  • https://github.com/echen/unsupervised-language-identification, optimized for separating English from other languages (has a live demo)
  • https://github.com/feedbackmine/language_detector, less specialized, detects more languages. Some languages may need extra training; I found it not precise enough for German text.
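To give a feel for what "n-gram-based" means here, below is a minimal pure-Ruby sketch of the general idea: build a character trigram profile per language and pick the profile most similar to the input. This is not the code of either gem above. The names `SAMPLES`, `trigram_counts`, `cosine` and `guess_language` are made up for illustration, and the training sentences are tiny placeholders; a usable detector would need a much larger corpus per language.

```ruby
# Toy offline language guesser using character trigram profiles.
# Assumption: the training samples are illustrative only and far too
# small for real-world accuracy.
SAMPLES = {
  english: "the quick brown fox jumps over the lazy dog and this is what " \
           "the people think about that with their own words",
  german:  "der schnelle braune fuchs springt über den faulen hund und das " \
           "ist was die leute darüber denken mit ihren eigenen worten nicht",
  french:  "le renard brun rapide saute par dessus le chien paresseux et " \
           "c'est ce que les gens pensent de cela avec leurs propres mots"
}.freeze

# Count overlapping character trigrams after lowercasing and replacing
# anything that is not a letter or apostrophe with a single space.
def trigram_counts(text)
  normalized = " #{text.downcase.gsub(/[^[:alpha:]']+/, ' ').strip} "
  counts = Hash.new(0)
  normalized.chars.each_cons(3) { |gram| counts[gram.join] += 1 }
  counts
end

# Cosine similarity between two sparse trigram count hashes.
def cosine(a, b)
  dot = a.sum { |gram, n| n * b.fetch(gram, 0) }
  norm = ->(h) { Math.sqrt(h.values.sum { |n| n * n }) }
  denom = norm.call(a) * norm.call(b)
  denom.zero? ? 0.0 : dot / denom
end

# One trigram profile per language, built once at load time.
PROFILES = SAMPLES.transform_values { |text| trigram_counts(text) }.freeze

# Return the language symbol whose profile is most similar to the input.
# Input with no trigrams scores 0.0 everywhere, so the first language wins.
def guess_language(text)
  input = trigram_counts(text)
  PROFILES.max_by { |_lang, profile| cosine(input, profile) }.first
end
```

With these samples, `guess_language("das ist nicht was die leute denken")` returns `:german`. Short or mixed-language input will be unreliable with profiles this small, which is likely the same reason the gems above benefit from extra training data.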


For anyone interested, I've found http://rubygems.org/gems/kenwaln-whatlanguage, which performs excellently.


I'm using CLD, which I really like: it's succinct and easy to use. Give it a try.


A quick demo of WhatLanguage in Ruby:

http://www.youtube.com/watch?v=lNqZ2cqOReo&list=UUJ_3fstMOH-g4yBxtvgAWkw&index=0&feature=plcp
