Speech Synthesis and Recognition in Linux

Festival speech synthesis

Software and guides

  • Debian:
    • festival festival-doc festival-freebsoft-utils gstreamer0.8-festival
    • speech-dispatcher-festival speech-dispatcher cl-speech-dispatcher
    • speech-tools speech-tools-doc speechd-el speechd-el-doc-cs  libgnome-speech3 libspeechd1
    • konq-speaker gnopernicus (screen readers)
    • gok gok-doc (on-screen keyboard)
    • dasher (typing by mouse)
  • Guide: http://festvox.org/festtut/notes/festtut_3.html
  • KDE Text-to-Speech system (KTTS)
  • TuxTalk (the GNU/Linux operating system for the blind)
  • PerlBox
  • Speak to me, Linux (Jan 2005 article)

To convert a document to an mp3 file:

  • Produce a clean ASCII text version of the document (in OOWriter, save as txt)
  • text2wave < document.txt > document.wav
  • lame -h document.wav document.mp3
Note that clitunno can perform all three steps; the second is especially CPU-intensive.
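
The three steps above can be chained in a small shell function. This is only a minimal sketch, assuming text2wave (from Festival) and lame are both on the PATH; the function names stem and text_to_mp3 are made up for illustration and are not part of either tool.

```shell
#!/bin/sh
# Sketch: turn a plain-text file into an MP3 via text2wave and lame.
# Function names here are illustrative, not part of Festival or lame.

# Strip the final extension from a path: document.txt -> document.
# A path whose file name has no dot is returned unchanged.
stem() {
    case "${1##*/}" in
        *.*) printf '%s\n' "${1%.*}" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}

# Convert one text file; the intermediate wav is deleted afterwards.
text_to_mp3() {
    txt=$1
    base=$(stem "$txt")
    # Fail early if either tool is missing.
    for tool in text2wave lame; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 1; }
    done
    text2wave < "$txt" > "$base.wav" || return 1   # the CPU-intensive step
    lame -h "$base.wav" "$base.mp3" || return 1
    rm -f "$base.wav"
}

# Usage: text_to_mp3 document.txt
```

Removing the wav at the end is a design choice made here to save disk space; drop the rm line if you want to keep it.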

Five modes are possible (all but A require a sound card):

A. Convert an ASCII text file to a wav file:

        text2wave < textfile.txt > textfile.wav

B. KMouth (part of KDE)

Copy and paste the text to be read into KMouth, which starts festival.

C. Browser reads to you with konq-speaker (there are also keyboard shortcuts)

Mark text, select Tools | Speak Web Page

D. Command-line conversion to speech:

echo You are feeling very sleepy -- do exactly as I say | festival --tts

E. Festival shell (see man festival for instructions on the configuration file)

You can set up a configuration file (see /home/steen/festival/ex.scm):
  • Start: festival ex.scm
    (SayText "This is a pen")
  • Set a voice:
     (voice_kal_diphone) American male (this is the one I have)
     (voice_rab_diphone) British male
     (voice_ked_diphone) American male
These voices are in /usr/share/festival/voices/english -- see if you can get the others.
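
To see which voices are actually installed and to try one without opening the Festival shell, something like the following may help. It is a sketch under two assumptions: that each voice lives in its own subdirectory of the voices directory named above, and that the subdirectory name matches the voice function (kal_diphone for voice_kal_diphone). The function names list_voices, voice_expr and say_with_voice are invented for this example.

```shell
#!/bin/sh
# Sketch: list installed Festival voices and speak with a chosen one.
# All function names here are illustrative only.

# Print the name of each subdirectory of a voices directory.
# Defaults to the path mentioned in the text; pass another as $1.
list_voices() {
    dir=${1:-/usr/share/festival/voices/english}
    for v in "$dir"/*/; do
        [ -d "$v" ] || continue      # skip the unexpanded pattern if empty
        v=${v%/}
        printf '%s\n' "${v##*/}"
    done
}

# Build the Scheme call that selects a voice:
# kal_diphone -> (voice_kal_diphone)
voice_expr() {
    printf '(voice_%s)\n' "$1"
}

# Speak a sentence using festival's batch mode (-b), which evaluates
# arguments beginning with a parenthesis as Scheme expressions.
# Assumes the sentence contains no double quotes.
say_with_voice() {
    festival -b "$(voice_expr "$1")" "(SayText \"$2\")"
}

# Usage:
#   list_voices
#   say_with_voice kal_diphone "This is a pen"
```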

Italian and German voices are also available.

Dasher will use festival to speak what you write.


ViaVoice speech recognition

Guide at http://volker.orcon.net.nz/linux/viavoice.html

Start at http://www-3.ibm.com/software/speech/enterprise/te_3.html

I tried (and failed) to order ViaVoice directly from IBM today, from their commerce page, which contains links to the following:

Technical Support will be provided by participation in the Discussion Group. To subscribe to the ViaVoice Dictation for Linux e-mail discussion list, send an e-mail to join-vvdictator@laser.sparklist.com and follow the directions.

Check the following website for the Java GUI code in our latest SDK

Reviewed in Linux Planet

Tcl/SMAPI project

Reviews and resources

30 Day Money Back Guarantee

We are looking for people who have used Dictation for Linux for at least 4 to 7 hours per week as a productivity application. If you want to help improve the usability of the IBM voice recognition product on Linux, just fill out our usability questionnaire. To get a copy of the questionnaire, send an e-mail note to Art Keller at kellera@us.ibm.com.

11K8437 IBM ViaVoice Dictation for Linux U.S. English (includes Microphone Headset): $39.95
 
 

 


Maintained by Francis F. Steen, Communication Studies, University of California Los Angeles

