How can I detect a user's input language using Ruby without using an online service?
I'm looking for a library or technique to detect the input language of blocks of text provided by 开发者_JAVA百科users. Online lookups (like Google translate) won't work for this task as I'm writing an app which must run offline.
Thanks.
Here are two more n-gram-based gems you might want to try. They work offline.
- https://github.com/echen/unsupervised-language-identification, optimized for separating english and other languages (has a live demo)
- https://github.com/feedbackmine/language_detector, less specialized, will detect more languages. Some languages may need some extra training — I found it to be not precise enough for German text.
For anyone interested, I've found http://rubygems.org/gems/kenwaln-whatlanguage, which is performing excellently.
I'm using CLD which I really like, succinct and easy to use. Give it a try.
A quick demo of WhatLanguage in Ruby:
http://www.youtube.com/watch?v=lNqZ2cqOReo&list=UUJ_3fstMOH-g4yBxtvgAWkw&index=0&feature=plcp
精彩评论