This chapter provides a general overview of DECtalk Software.
The accuracy of word pronunciation is higher than in any previous version of
DECtalk Software. There have been significant improvements in the accuracy and
quality of letter-to-phoneme rules. DECtalk Software also has a large built-in
dictionary that improves both the pronunciation of individual words and their
rhythmic naturalness.
Certain heuristics have been improved and made more intelligent. For example,
DECtalk Software is able to recognize and parse unpronounceable sequences such
as uppercase initials (FBI, AAA, and so forth) in addition to the normal
unpronounceable sequences such as those with no vowels (CBS or NBC, for
example).
Pronunciation Heuristics
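The two cases named above (uppercase initials and sequences with no vowels) can be illustrated with a minimal sketch. This is not the DECtalk Software implementation; the function names, the length limit for initials, and the treatment of "y" as a vowel are all assumptions made for illustration.

```python
# Illustrative heuristic only; DECtalk Software's actual rules are not
# documented in this chapter. All names here are hypothetical.
VOWELS = set("aeiouy")  # assumption: treat "y" as a vowel


def should_spell_out(token: str) -> bool:
    """Return True if the token looks unpronounceable as a word."""
    if not token.isalpha():
        return False
    # Case 1: no vowels at all, for example CBS or NBC.
    if not any(ch in VOWELS for ch in token.lower()):
        return True
    # Case 2: short all-uppercase tokens (initials), for example FBI or AAA.
    # The 2-to-4 letter limit is an arbitrary choice for this sketch.
    return token.isupper() and 2 <= len(token) <= 4


def expand(text: str) -> str:
    """Separate the letters of spell-out tokens so each is read on its own."""
    return " ".join(
        " ".join(tok) if should_spell_out(tok) else tok
        for tok in text.split()
    )


print(expand("The FBI called NBC today"))
```

A rule this crude also spells out acronyms that are normally read as words and abbreviations such as "Mr", which is one reason a real system would consult a dictionary before falling back to a heuristic.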
The Text-To-Speech API is the Digital extension to Multimedia Services for
Digital UNIX. You can use this API to write your own applications. You need
the DECtalk Software Development Kit to access the API.
DECtalk Software API
The API function set gives you a flexible method of manipulating DECtalk
Software functionality from within your application. These functions perform a
wide range of tasks associated with the Text-To-Speech system. See the DECtalk
Software Programmer's Reference Guide (QA-228AA-WZ.4.2A) for a complete list of
API functions.
DECtalk Software programming aids include Voice-Control
commands, also called inline commands. These commands can be used to
perform simple voice-control operations, such as changing the speaking rate or
speaking voice while DECtalk Software is speaking, or more complex operations,
such as modifying the characteristics of each voice, controlling intonation and
stress within written text, or creating special effects such as singing.
Commands are inserted into ASCII text files displayed in one of the program
applets or directly into the application sources through the API functions.
Voice-Control Commands
Commands have special syntax rules and components that you need to use when you
insert them into files.
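As a rough illustration of embedding a command in text, the sketch below prefixes a phrase with a speaking-rate command. The bracketed `[:rate N]` form is an assumption not shown in this chapter, so check the exact syntax in the reference guide. The 75 to 650 range is borrowed from the `-r` option of the say program and is assumed to apply here as well. The function name `with_rate` is hypothetical.

```python
# Sketch only: the "[:rate N]" syntax and the 75-650 range are assumptions
# to verify against the DECtalk Software reference documentation.
def with_rate(text: str, rate: int) -> str:
    """Prefix text with an inline speaking-rate command."""
    if not 75 <= rate <= 650:
        raise ValueError(f"rate must be between 75 and 650, got {rate}")
    return f"[:rate {rate}] {text}"


# Change the speaking rate partway through a passage.
passage = " ".join([
    with_rate("This sentence is spoken slowly.", 120),
    with_rate("This one is spoken quickly.", 300),
])
print(passage)
```

The resulting string could be saved to an ASCII text file or passed to the application through the API, as described above.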
DECtalk Software has two pronunciation dictionaries: a large internal
(built-in) dictionary and an optional user-defined dictionary. With the large
built-in dictionary, developers can easily use many proper names and normally
unpronounceable sequences, such as uppercase initials, in their applications.
With the user dictionary build tool, developers can load application-specific
words, or cultural- or language-specific terms into the user dictionary. A
sample user-dictionary file is installed with the software.
DECtalk Software Dictionaries
The DECtalk Software components now installed on your system include:
DECtalk Software Components
say [-h] [-s #] [-r #] [-d #] [file] [-a "text"]
-a "text"      Speaks the quoted string that follows the switch, then
               returns to the Digital UNIX command prompt.
-d #           Selects the audio output device.
-e #           Selects the output wave file format. Valid values are
               the integers 1 to 3:
                 1. PCM, 16-bit mono, 11 kHz
                 2. PCM, 8-bit mono, 11 kHz
                 3. Mu-law, 8-bit mono, 8 kHz
-f <filename>  Names the output wave file.
-h             Displays the command line parameter list.
-r #           Sets the speaking rate (75 - 650).
-s #           Selects the speaker number (1-9).
<filename>     Names an input ASCII file to synthesize.
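The options above can be assembled and range-checked before the say program is invoked. The following is a minimal sketch, assuming the switches and ranges listed in this section. The helper names (`build_say_argv`, `_check_range`) are hypothetical, and the argument order simply follows the synopsis.

```python
from typing import List, Optional


def _check_range(name: str, value: int, low: int, high: int) -> None:
    """Raise ValueError if value falls outside the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_say_argv(
    text: Optional[str] = None,
    input_file: Optional[str] = None,
    speaker: Optional[int] = None,
    rate: Optional[int] = None,
    device: Optional[int] = None,
    wave_format: Optional[int] = None,
    wave_file: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the say program from the listed switches."""
    argv = ["say"]
    if speaker is not None:
        _check_range("speaker", speaker, 1, 9)          # -s # (1-9)
        argv += ["-s", str(speaker)]
    if rate is not None:
        _check_range("rate", rate, 75, 650)             # -r # (75 - 650)
        argv += ["-r", str(rate)]
    if device is not None:
        argv += ["-d", str(device)]                     # -d # audio device
    if wave_format is not None:
        _check_range("wave_format", wave_format, 1, 3)  # -e # (1 to 3)
        argv += ["-e", str(wave_format)]
    if wave_file is not None:
        argv += ["-f", wave_file]                       # -f output wave file
    if input_file is not None:
        argv.append(input_file)                         # input ASCII file
    if text is not None:
        argv += ["-a", text]                            # -a "text"
    return argv


print(build_say_argv(text="Hello", speaker=2, rate=200))
```

On a system where DECtalk Software is installed, the returned list could be handed to `subprocess.run`. The sketch stops short of that so it runs anywhere.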
mailtalk is a program applet included with DECtalk Software that announces
the arrival of mail messages as they are delivered to your system. Depending on
the options you select, mailtalk announces the sender of the message, its
subject, or both. A more detailed explanation of this program is presented in
the next chapter.
mailtalk Program
aclock announces the time of the day. It takes the following command line
parameters:
aclock Program
aclock [-h] [ # ]

where # is the interval in minutes:
   5   every five minutes
  15   every fifteen minutes
  30   on the hour and half hour
  60   on the hour
  -h   Displays the command line parameter list
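The interval values can be read as "announce whenever the current minute is a multiple of the interval". The sketch below expresses that reading. It assumes announcements are aligned to the clock, as the "on the hour and half hour" wording suggests, and the names are hypothetical rather than taken from aclock itself.

```python
# Sketch of the interval semantics listed above; not aclock's actual code.
VALID_INTERVALS = (5, 15, 30, 60)


def should_announce(minute: int, interval: int) -> bool:
    """Return True if the time is announced at this minute of the hour."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"interval must be one of {VALID_INTERVALS}")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    return minute % interval == 0


def announcement_minutes(interval: int) -> list:
    """List every minute of the hour at which an announcement occurs."""
    return [m for m in range(60) if should_announce(m, interval)]


print(announcement_minutes(15))
print(announcement_minutes(60))
```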
The user dictionary program, windict, is used to create special dictionary
files. The dictionary file contains words that have special user-specified
pronunciation rules. Dictionary work files are compiled into dictionaries that
can then be loaded into the speak and say programs. More details of this tool
are provided in the next chapter.
User Dictionary Program (windict)
The following unsupported applications are shipped with DECtalk Software 4.2A.
They demonstrate the advanced capabilities of DECtalk Software, are provided
for demonstration purposes only, and are not fully supported by Digital
Equipment Corporation.
Unsupported Applications
DECface is a computer-generated, synthetic face that synchronizes facial
movements to synthesized speech provided by DECtalk. As DECtalk generates
speech, DECface displays the facial expressions of a human actually speaking
those words.
DECface
DECface offers the ability to develop a large variety of new applications by combining the audio functionality of a speech synthesizer with the graphical functionality of a computer-generated face. A synthetic character can give multimedia presentations, or monitor a system and report anomalies as a feedback agent.
DECface enhances DECtalk by providing an obvious and immediate visual feedback mechanism. In particular, multimedia projects involving direct user interaction can be enhanced to better attract and maintain the attention of viewers.
Specific information on how to invoke and use DECface can be found in the documents located in:
/usr/opt/DTKRT420/decface/docs
or by typing:
Information on how to use emacspeak is provided in the documents located in the directory:
/usr/opt/DTKRT420/emacspeak/docs
or by typing:
The DECtalk application user accesses the application through the Motif windows
environment or at the Digital UNIX command line. DECtalk Software also provides
a CDE integration subset that can be installed on systems that support CDE.
DECtalk Software provides several methods of control. The user can use the
abbreviated command set provided with the application to control basic
operations, such as the speaking rate or the speaking voice. The user can
also use the user dictionary to fine-tune the application's basic pronunciation
and voice characteristics. Finally, the user can also embed in-line commands
into text files to control DECtalk operations. Refer to the specific sections
for more information on which method to use.
By the Application User
DECtalk Software converts ASCII English language text into speech output
through a speech synthesizer. There are two ways to feed text into the speech
synthesizer: through the user interface or through the API. The flow of the
text-to-speech process is explained below.
How DECtalk Software Works
Figure: Flow of the DECtalk Software Text-to-Speech Conversion Process