TTS Speech Synthesizer development for minority Asian languages

I want to put together a team to develop a TTS speech synthesizer for various phonetic Asian languages. I have the language experts lined up. The final product will be 1. an Android phone app, and 2. a web-based TTS service.

Languages in order of implementation:

Mien

Hmong

Lao

The first two have Latin-based orthographies.

My question is: on the programming side, who do I need on my team? What skills and programming languages am I looking for?


Someone with:
1. A DSP (digital signal processing) background, with an emphasis on speech and audio signal processing, would be a good bet.
2. Of course, the person must have good-to-great programming skills.
3. A liking for learning new languages. Learning the language for which the TTS engine is being developed, or at least gaining a rudimentary understanding of it, is needed for a programmer to "know" what is being coded, and maybe even to improve upon existing algorithms.

You can take a look at the FestVox and Festival pages (CMU's voice-building toolkit and TTS engine) and see what languages they are developed in. That might give you a better idea.

TTS engines sit almost at the crossroads of linguistic science, DSP (for the backend), and software engineering for implementing all of the above. I think you need DSP and software people to complete your team.
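To give a feel for the DSP side of that backend work: in a tonal language, the same segmental content spoken with different pitch contours is a different word, so the synthesizer's prosody stage must control the fundamental frequency (F0) over time. The sketch below is only a toy illustration, not how a production engine works: it renders a sine "vowel" with a linear pitch glide and writes it to a WAV file using the Python standard library. The function names and frequency values are made up for the example.

```python
import math
import struct
import wave

SR = 16000  # sample rate in Hz

def synth_contour(f0_start, f0_end, dur=0.4):
    """Render a sine 'vowel' whose pitch glides from f0_start to f0_end Hz.

    A real TTS backend shapes formants, duration, and voice quality too;
    this toy varies only F0, the dimension that carries lexical tone.
    """
    n = int(SR * dur)
    phase = 0.0
    samples = []
    for i in range(n):
        f0 = f0_start + (f0_end - f0_start) * i / n  # linear pitch glide
        phase += 2 * math.pi * f0 / SR               # integrate frequency
        env = math.sin(math.pi * i / n)              # fade in/out, no clicks
        samples.append(0.8 * env * math.sin(phase))
    return samples

def write_wav(path, samples):
    """Write mono 16-bit PCM at SR Hz."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SR)
        w.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples))

# Same segmental content, two different tones:
write_wav("tone_high_falling.wav", synth_contour(220, 140))
write_wav("tone_mid_rising.wav", synth_contour(160, 220))
```

Listening to the two output files back to back makes the point: nothing differs but the pitch trajectory, yet in Mien, Hmong, or Lao that difference alone would change the word.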

All the best and hope that helps,
Sriram.


Are you a native speaker of any of these languages? You are absolutely going to need fluent speakers of all these languages and, quite possibly, a linguist on your team in addition to programmers with good DSP skills.

The languages you've listed are all "tonal" languages. So, in addition to the usual challenges you would encounter in building a Text-To-Speech system for Romance (French, Spanish, Italian, etc.) or Germanic (English, German, etc.) languages, you will have to deal with tonality as well. In a tonal language, you can have multiple words that have essentially the same pronunciation (at least they sound the same to the ears of an untrained Westerner), and they may even have the same Latin orthography. The sole difference between them in speech is the pitch of the word relative to other words in the sentence, or a change in pitch that occurs as the word is spoken.

If you are unlucky and these words do have the same Latin orthography, then you have a need for someone on the team with expertise in artificial intelligence, because your program will have to recognize which word is intended from its context in a sentence in order to produce the correct sound.
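One concrete detail worth knowing here: the standard Hmong RPA orthography actually writes tone as a word-final consonant letter (e.g. -b high, -j high falling, -v mid rising, -s low; an unmarked syllable carries the mid tone), so the TTS front end must peel that marker off before grapheme-to-phoneme conversion and route it to the prosody model instead. The sketch below illustrates that split; the function name and tone labels are my own, not from any particular toolkit.

```python
# Hmong RPA tone markers: a word-final consonant letter that encodes
# lexical tone rather than a spoken consonant. Labels are informal.
RPA_TONES = {
    "b": "high",
    "j": "high-falling",
    "v": "mid-rising",
    "s": "low",
    "g": "low-breathy",
    "m": "low-creaky",
    "d": "low-rising",
}

def split_rpa_syllable(syllable):
    """Return (segmental base, tone label) for a Hmong RPA syllable."""
    last = syllable[-1].lower()
    if last in RPA_TONES:
        return syllable[:-1], RPA_TONES[last]
    return syllable, "mid"  # unmarked syllables carry the mid tone

# "Hmoob" (Hmong) and "peb" (we / three) both carry the high -b tone:
print(split_rpa_syllable("Hmoob"))  # ('Hmoo', 'high')
print(split_rpa_syllable("peb"))    # ('pe', 'high')
print(split_rpa_syllable("kuv"))    # ('ku', 'mid-rising')
```

Because RPA (and the Mien unified script, which uses a similar device) marks tone explicitly, the context-disambiguation problem above mostly bites for genuinely identical spellings and for Lao, which uses its own non-Latin script.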

Good luck with your project!


The short answer is lots.

To develop a worthwhile TTS from scratch would take tens of thousands of programming hours.

An alternative is that you could work in partnership with us, and we can give you an exclusive licence to use those languages for a period of time.

The company I work with is the world's leading provider of mobile and automotive text-to-speech software and languages.

By using our engine, you would save years of effort and hundreds of thousands in development costs, and be able to deliver those languages within a couple of months.

Cheers, Neil
