How to build an accurate translation engine?
A few months ago I came up with a formula of my own to translate any source language (computer characters) to a destination language (computer characters). It is written in Lua (for desktop users) with a C++ class (for native access), so that I can embed it in a web browser, etc. I am wondering whether something better already exists for this in C++ or Lua.
Mine sometimes does not translate grammar, or even rules, correctly. Before building it I thought my approach would be the best way to get this done, but it is taking way too long now, and I am afraid it may turn out to be the wrong implementation. Now I want to check out other engines and compare them with mine.
I have used Google Translate and others, but using them is not my goal: I am building a translation engine (like Google's or others'), where someone can put in their own dictionary and create rules.
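For concreteness, an engine of the kind described here (a user-supplied dictionary plus user-defined rules) can be sketched roughly as below. This is only a minimal illustration in Python, not the actual Lua/C++ implementation: the names `DICTIONARY`, `RULES`, `swap_adj_noun` and `translate`, the part-of-speech tags, and the English-to-Spanish example data are all assumptions made up for the sketch.

```python
# Hypothetical user-supplied dictionary: source word -> (target word, tag).
DICTIONARY = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
}


def swap_adj_noun(tokens):
    """Example rule: reorder ADJ NOUN into NOUN ADJ (as Spanish usually does)."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


# Hypothetical user-defined rules, applied in order after the word lookup.
RULES = [swap_adj_noun]


def translate(sentence, dictionary, rules):
    """Word-for-word lookup followed by the reordering rules."""
    words = sentence.lower().split()
    # Words missing from the dictionary pass through unchanged, tagged UNK.
    tokens = [dictionary.get(w, (w, "UNK")) for w in words]
    for rule in rules:
        tokens = rule(tokens)
    return " ".join(word for word, _tag in tokens)


print(translate("the red car", DICTIONARY, RULES))
```

With this toy data, "the red car" becomes "el coche rojo". Anything the dictionary or the rules do not cover (agreement, ambiguous words, idioms) is passed through or translated wrongly, which is where the grammar problems mentioned above tend to show up.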
Is there any existing translation framework or library (OpenCog or Moses) for translating from a source language to a destination language, for example Arabic to Chinese or English to Japanese? Or what else are Google and others using?
Any suggestions would be appreciated.
Thanks in advance.
I hate to discourage you, but you are trying to single-handedly solve the problem of Machine Translation. MT systems like Systran have been developed by teams of scientists and engineers for decades and they are still far from perfect.
Moses is a pretty good open-source translation library for C++. cdec represents the current state of the art (but requires context-free grammars for both the source and the target language). Both require large amounts of training data, i.e. parallel corpora: the same texts in both languages, aligned sentence by sentence.
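To give a feel for why parallel corpora matter, here is a minimal Python sketch of IBM Model 1, a classic textbook word-translation model trained with expectation-maximization. It is not the Moses or cdec API and is far simpler than what those systems do; the function names `train_ibm1` and `best_translation` and the three-sentence German/English toy corpus are assumptions made up for illustration.

```python
from collections import defaultdict

# Toy parallel corpus (hypothetical data): (source words, target words).
CORPUS = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]


def train_ibm1(corpus, iterations=10):
    """Estimate t[(e, f)], roughly P(target word e | source word f)."""
    target_vocab = {e for _src, tgt in corpus for e in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # start with no preference at all
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of e aligned to f
        total = defaultdict(float)  # expected count of f aligned to anything
        for src, tgt in corpus:
            for e in tgt:
                # E-step: share one unit of evidence for e among the source words.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: re-estimate the probabilities from the expected counts.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t


def best_translation(t, source_word):
    """Most probable target word for source_word; unseen words pass through."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    if not candidates:
        return source_word
    return max(candidates, key=candidates.get)


table = train_ibm1(CORPUS)
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(table, word))
```

Nobody told the model that "Haus" means "house"; it works that out only because "das" also appears with "the" in another sentence pair. With three sentences this is a toy, and real systems need millions of aligned sentence pairs to learn useful probabilities, which is the training-data requirement mentioned above.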
When you've finished, run to your university and demand a PhD.
Have you taken a look at the Google Translator Toolkit API? Analyzing it can give you a glimpse of what it implements and what you may need in order to develop your own translation framework (a lot of work, by the way).
Creating/Uploading translation documents
Full list of supported source and target languages
http://www.leniel.net/2010/12/playing-google-translator-toolkit-api.html
More to the stack:
Free/open-source machine translation systems and tools
GNU gettext
TinyTM - Open-Source Translation Memory