Language recognition in Java [closed]
Closed 8 years ago.
Is there an open-source language recognition library for Java? I have only found ones for C/C++.
Update:
I'm talking about the natural (human) language of a piece of text. Example:
Input: My name is John. Output: English.
Input: Ich heisse John. Output: German.
Input: Меня зовут Джон. Output: Russian.
See what you think of the language detector in Apache Tika. This assumes that you want to find out which language a piece of text is written in, as opposed to building a parser for a programming language.
TextCat (http://textcat.sourceforge.net/) doesn't support Russian, but it does handle the following languages:
- albanian
- danish
- dutch
- english
- finnish
- french
- german
- hungarian
- italian
- norwegian
- polish
- slovakian
- slovenian
- spanish
- swedish
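For anyone curious how this family of tools works, here is a minimal sketch of rank-order character n-gram classification, the general technique usually associated with TextCat. It is not TextCat's actual code: the class name, method names, profile size and training sentences are all made up for illustration, and a real detector would be trained on far more text per language.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy rank-order n-gram classifier (hypothetical names; not TextCat's code). */
final class NgramLanguageGuesser {
    private static final int MAX_N = 3;          // use 1-, 2- and 3-grams
    private static final int PROFILE_SIZE = 300; // keep at most this many n-grams

    private final Map<String, List<String>> profiles = new LinkedHashMap<>();

    /** Returns the text's n-grams, most frequent first (ties broken alphabetically). */
    static List<String> profileOf(String text) {
        // Lower-case the text and collapse every run of non-letters into "_".
        String s = "_" + text.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}]+", "_") + "_";
        Map<String, Integer> counts = new HashMap<>();
        for (int n = 1; n <= MAX_N; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(PROFILE_SIZE, ranked.size())));
    }

    /** "Out-of-place" distance: sum of rank differences, with a fixed penalty for misses. */
    static int outOfPlace(List<String> document, List<String> language) {
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < language.size(); i++) {
            rank.put(language.get(i), i);
        }
        int distance = 0;
        for (int i = 0; i < document.size(); i++) {
            Integer r = rank.get(document.get(i));
            distance += (r == null) ? PROFILE_SIZE : Math.abs(r - i);
        }
        return distance;
    }

    void train(String language, String sample) {
        profiles.put(language, profileOf(sample));
    }

    /** Returns the trained language whose profile is closest, or null if none is trained. */
    String guess(String text) {
        List<String> document = profileOf(text);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, List<String>> entry : profiles.entrySet()) {
            int distance = outOfPlace(document, entry.getValue());
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NgramLanguageGuesser guesser = new NgramLanguageGuesser();
        // Tiny illustrative training samples; real profiles need much more text.
        guesser.train("english", "My name is Anna and I live in a small town near the sea.");
        guesser.train("german", "Ich heisse Anna und ich wohne in einer kleinen Stadt am Meer.");
        guesser.train("russian", "Меня зовут Анна, я живу в Москве и люблю читать книги.");
        for (String input : new String[] {"My name is John.", "Ich heisse John.", "Меня зовут Джон."}) {
            System.out.println(input + " -> " + guesser.guess(input));
        }
    }
}
```

With training samples this small, closely related languages that share a script can easily be confused, which is why the ready-made libraries ship with pre-built profiles.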
There is the Language Detection API, which accepts text via HTTP POST and returns JSON with the detected languages and their scores. It can be used from Java or any other programming language.
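As a rough sketch of what calling such a service from Java could look like, the snippet below posts form-encoded text with the JDK's built-in `java.net.http` client (Java 11 or later) and returns the raw JSON response. The endpoint URL is supplied by the caller, and the form field name `q` and the class and method names are assumptions made for illustration; check the service's own documentation for the real endpoint, parameters and any API key it requires.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for an HTTP language detection service. */
final class DetectionApiClient {

    /** Builds a form-encoded body; the field name "q" is an assumption. */
    static String formBody(String text) {
        return "q=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    /** Builds the POST request without sending it. */
    static HttpRequest buildRequest(URI endpoint, String text) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(text)))
                .build();
    }

    /** Sends the request and returns the response body (expected to be JSON). */
    static String detect(URI endpoint, String text) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(endpoint, text), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java DetectionApiClient <endpoint-url>");
            return;
        }
        System.out.println(detect(URI.create(args[0]), "Ich heisse John."));
    }
}
```

The JDK has no built-in JSON parser, so pulling the language codes and scores out of the response would need a JSON library of your choice.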
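I think ANTLR is pretty much the standard, if what you need is a parser for a programming language rather than detection of human languages.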