
How to lengthen the pause between words with text-to-speech (pyTTS or SAPI5)

Is it possible to extend the gap between spoken words when using text-to-speech with SAPI5?

The problem is that, especially with some voices, the words almost run together, which makes the speech harder to understand.

I'm using Python and the pyTTS module (on Windows, since it uses SAPI).

I tried to hook the OnWord event and add a time.sleep() or tts.Pause() call, but although all the events are caught, they are apparently processed only at the end of the spoken text, whether I use the sync or the async flag.

In this NON-WORKING example, the sleep() method is executed only after the sentence is spoken:

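import pyTTS
from time import sleep

tts = pyTTS.Create()

def f(x):
    # Intended to pause after every word; in practice this only runs
    # once the whole text has been spoken.
    tts.Pause()
    sleep(0.5)
    tts.Resume()

tts.OnWord = f
tts.Speak(text)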

Edit: -- accepted solutions

The actual answers for me were either

  • saying each word in its own "speak" command, (suggested by @Lennart Regebro), or
  • replacing each space with a comma, (as mentioned by @Dawson), e.g.

    text = text.replace(" ", ",")

That sets a reasonable pause. I didn't investigate the Pause method any further than I mentioned above, since I'm happy with the accepted solutions.
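As a minimal sketch of the comma approach: a plain replace(" ", ",") doubles the punctuation wherever a word already ends in a comma or period, so the hypothetical helper below (add_comma_pauses is not part of pyTTS) only appends a comma to words that do not already end in a pause mark. How long the resulting pause is depends on the voice.

```python
def add_comma_pauses(text):
    """Return text with a comma after each word so the voice pauses briefly."""
    words = text.split()
    parts = []
    # Every word except the last gets a trailing comma, unless it already
    # ends in punctuation that makes the synthesizer pause anyway.
    for word in words[:-1]:
        parts.append(word if word[-1] in ",.;:!?" else word + ",")
    # The last word is kept unchanged.
    parts.extend(words[-1:])
    return " ".join(parts)


print(add_comma_pauses("The quick brown fox."))
```

The returned string can then be passed to tts.Speak() in place of the original text.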


You're talking about the voice Rate, right? http://msdn.microsoft.com/en-us/library/ms990078.aspx

Pause(), I believe, works a lot like a comma in a normal speech pattern, except that you determine the length (natural or not).


I don't have any great solutions here. But:

PyTTS's last release was in 2007, and there seems to be no documentation. The same people now maintain a cross-platform library called pyttsx, which also supports SAPI. It has a words-per-minute setting, but no setting to increase the pause between the words. This is most likely because there is no pause between the words at all.

You can insert a long pause by making each word its own "utterance".

engine.say('The')
engine.say('quick')
engine.say('brown')
engine.say('fox.')

instead of

engine.say('The quick brown fox.')

But that is probably too long. Other than that, you probably have to wrap or subclass the SAPI driver, but I'm not 100% sure that's going to work either. People don't have pauses between words, so I'm not sure that the speech engines themselves support it.
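The one-utterance-per-word idea can be wrapped in a small loop. This is only a sketch: speak_each_word is a hypothetical helper, and it assumes an engine object with the say() and runAndWait() methods that pyttsx provides. The length of the gap is whatever the driver leaves between utterances, so it is not tunable here.

```python
def speak_each_word(engine, text):
    """Queue every word as a separate utterance, then speak the queue.

    Returns the number of utterances that were queued.
    """
    words = text.split()
    for word in words:
        engine.say(word)   # each call queues one utterance
    engine.runAndWait()    # blocks until the whole queue has been spoken
    return len(words)


# Usage sketch (assumes pyttsx is installed; not run here):
#   import pyttsx
#   engine = pyttsx.init()
#   speak_each_word(engine, "The quick brown fox.")
```

Passing the engine in as a parameter keeps the helper independent of which driver is in use.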


I've done some TTS work using the .NET APIs before. There is an enum in the System.Speech.Synthesis namespace called PromptBreak, which has different values for the length of the pause/break you want: http://msdn.microsoft.com/en-us/library/system.speech.synthesis.promptbreak.aspx

No idea if/how it can be used with PyTTS, but maybe it's a starting point.
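On the SAPI5 side, the closest counterpart I know of is the <silence msec="..."/> tag from the SAPI 5 XML TTS markup, which is a different mechanism from PromptBreak. The sketch below only builds the marked-up string; with_silence_tags is a hypothetical helper, and the commented-out lines that hand the string to SAPI are an untested assumption (they need Windows and pywin32).

```python
from xml.sax.saxutils import escape


def with_silence_tags(text, pause_ms=250):
    """Insert a SAPI5 XML silence tag between the words of text."""
    tag = ' <silence msec="%d"/> ' % pause_ms
    # Escape &, < and > so that ordinary words cannot break the markup.
    return tag.join(escape(word) for word in text.split())


print(with_silence_tags("The quick brown fox.", pause_ms=300))

# Assumed usage, not tested here:
#   import win32com.client
#   voice = win32com.client.Dispatch("SAPI.SpVoice")
#   voice.Speak(with_silence_tags("The quick brown fox."), 8)  # 8 = SVSFIsXML
```

Unlike the comma trick, this lets you state the pause length in milliseconds, provided the voice honours the tag.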
