                           CHAPTER 5
                            PHONEMICS
                                
DECtalk PHONEMIC INPUT
This  chapter describes the phonemic (sound) system  of  English
used    in   DECtalk   and   the  ways  to   control   DECtalk's
pronunciation.

DECtalk   represents  the  state-of-the-art  in   text-to-speech
synthesis. The software shipped with DECtalk PC is the latest in
DECtalk speech microcode. It contains a number of significant
improvements over its predecessors: it makes fewer pronunciation
errors, handles text in a more sophisticated way, and should
sound more natural and intelligible. Naturalness in synthesized
speech is evolving
slowly   because  of  the  inherent  complexity  of   accurately
replicating human speech as well as the difficulty of adequately
defining what naturalness itself means. However, the developer
will find that phonemic transcription becomes less necessary
with this added sophistication.

Note:   DECtalk  V4.x  seldom  makes  errors  in  pronunciation.
Developers  of screen reader software who also use other  speech
synthesizers should be aware that creating special phonemic text
often   creates  problems  where  none  exist.  To  avoid  this,
ascertain whether a particular pronunciation is a problem before
attempting  to  phonemicize a word or phrase.  Also,  developers
should  be  aware  that  different synthesizers  take  different
phonemic alphabets and the one described below will probably not
be interchangeable with other synthesizers.

PHONEMIC TRANSCRIPTION
Most  users do not need to know anything about DECtalk  phonemic
input  and may never need to use the phonemic alphabet. This  is
because improvements made in text-to-speech technology in the
past few years mean that ordinary words are rarely mispronounced
and seldom need correction. On the other hand,
many  developers  will  want  to  enter  unusual  words  in  the
definable  dictionary or, for various reasons, modify the  sound
of  the synthesized speech, perhaps to attain a higher degree of
naturalness,  to  demonstrate  emotion,   or  to   emphasize   a
particular word or phrase. In these cases, it may be helpful  to
understand in a bit more detail how DECtalk works.

To  understand how the DECtalk system works and to  make DECtalk
correctly  pronounce any English word, you  may  wish  to   know
something  about speech sounds and how to represent  them  on  a
keyboard.  Because  spelling in English  does  not  always  show
exactly   how words are pronounced, dictionaries use symbols  to
show  how  words really sound. Sometimes these symbols  are  the
same as  letters used in spelling. A word written the way it  is
pronounced is said to be in  phonemic transcription or simply in
phonemics.

PRONUNCIATION ERRORS
When DECtalk says a word or phrase incorrectly, you may need  to
use  phonemic  input  to  get  the  desired  pronunciation.  The
following   list suggests the most common types of  errors  that
DECtalk makes,  and the best corrective action.

Note:   Prior   to  using  phonemic  transcription   or   clever
misspellings,  ascertain that DECtalk does  indeed  mispronounce
the  word.  In  the  vast majority of cases, the  word  will  be
pronounced correctly. To use phonemic representations of
words,  the  Phonemic  Mode must be turned  ON  with  a  special
command (above).

MISPRONOUNCING A PROPER NAME
       DECtalk mispronounces a proper name.
               Lee Iacocca
       Corrective action: Convert to phonemic form.
               Lee [ayaxk'owkax]
       Or misspell in a clever way.
               Lee Eye a Coke a.

MISPRONOUNCING AN ACRONYM
       DECtalk mispronounces an acronym.
               The UN building
       Corrective action: Respell with spaces between the letters.
               U N
       Or use phonemics.
               ['yuw 'ehn]
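
An application can apply the respelling fix before text is sent to
DECtalk. The following Python sketch is an illustration only and is
not part of the DECtalk software; the function name and the set of
known acronyms are assumptions.

```python
import re

# Hypothetical set of acronyms the application knows DECtalk misreads.
KNOWN_ACRONYMS = {"UN"}

def space_out_acronyms(text, acronyms=KNOWN_ACRONYMS):
    """Respell each listed acronym with spaces between its letters."""
    def respell(match):
        word = match.group(0)
        return " ".join(word) if word in acronyms else word
    # \b[A-Z]+\b matches whole words made only of uppercase letters.
    return re.sub(r"\b[A-Z]+\b", respell, text)

print(space_out_acronyms("The UN building"))  # The U N building
```

Acronyms that are not in the set are left unchanged, which follows
the advice to correct only words that are known to be mispronounced.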

MISPRONOUNCING AN UNFAMILIAR WORD
       DECtalk mispronounces an unfamiliar word.
               articulatory
       Corrective action: Convert to phonemic form.
               [aart'ihkyaxlaxtowriy]

STRINGS CONTAINING NONALPHABETIC CHARACTERS
DECtalk  mishandles  a  letter string  containing  nonalphabetic
characters:
               autoexec.bat
               readme.txt

       Corrective action: Respell with inserted spaces.
               auto exec dot bat
               read me dot  text
       Or convert to phonemic form.
               ['aotowixgz`ek*daat*b'aet]
               [r'iyd*miy*daat*t'ehkst]

AMBIGUOUS PRONUNCIATIONS
DECtalk guesses incorrectly for an ambiguously pronounced word.
               The insert
               Get the lead out.
       Corrective action: Convert to phonemic form.
               The ['ihnsrrt]
               Get the [l'ehd] out.

SYNTACTIC CLASSIFICATION
DECtalk uses the wrong syntactic classification of a preposition
or  particle.
               He takes on tough jobs.
               ("He does tough jobs" versus "He accepts graft
               when on tough jobs.")

       Corrective action: Add a stress phoneme when needed.
               He takes [']on tough jobs.
       Or convert to phonemic form.
               He takes ['aan] tough jobs.

INCORRECT PHRASING
DECtalk uses the wrong phrasing.
               Following a long gasp shouts were heard.
       Corrective action: Add commas or a verb-phrase introducer
       phoneme where needed.
               Following a long gasp, shouts were heard.

INTRODUCTION TO PHONEMIC THEORY
At  one time long ago, English was pronounced as it was spelled,
with  each  letter (or pair of letters) representing one  sound.
Because  of  historical sound changes such as the  dropping   of
sounds  like  the  gh of "bought" or the k of  knight  and  word
borrowing  from  other  languages, English  pronunciation  rules
have  become complex and  include many exceptions. For  example,
of  is  pronounced with a v sound, while all other English words
spelled  with  f  are pronounced with an f   sound.  The   vowel
sequence ea can be pronounced in at least a half-dozen  ways, as
illustrated by the sounds in the words cheap, head,  earth,  and
idea. The letters th can be pronounced with a  voiceless phoneme
as in thin, or with a voiced phoneme as in this; or the th can
represent  the t phoneme followed by the h phoneme  in  compound
words such as pothole.

Some  words have two pronunciations, for example, read.  Correct
pronunciation of a sentence such as Will you read  the  book  or
have  you  read  it  already? requires an understanding  of  the
meaning of the sentence, a task that DECtalk is learning to
do. DECtalk can often correctly predict which pronunciation is
correct. However, because of the nature of language, it
occasionally makes a mistake. If this occurs, you can get the
alternate pronunciation in two ways.

       By misspelling the word, e.g., "red" for "read"

       By phonemic spelling: [r'ehd]
       Example: Will you read the book or have you [r'ehd] it
       already?

Stress  is an important part of phonemic representation.  Stress
alone  distinguishes the two different pronunciations  of  words
like  "insert."

English words usually have one syllable that is spoken with more
stress  than  the other syllables in the word. You can  indicate
this   primary stress to DECtalk by placing the phonemic  symbol
[']  before the vowel. The ['] symbol is described below.

For example, the word "insert" can be spoken as a noun
       "insert" = ['ihnsrrt]
 and as a verb
        "insert" =  [ixns'rrt].

Considering  the complexity of English pronunciation  rules  and
the   number  of exceptions, it is not surprising  that  DECtalk
occasionally   makes such pronunciation errors. You  can  adjust
DECtalk  pronunciation   through  a  large  number  of  symbols,
described  in  the  rest  of this  chapter.  DECtalk  V4.x   has
improved  pronunciation rules and, as a  result,  such  phonemic
intervention will only occasionally be needed.
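
The stress placement shown above for "insert" can be sketched in a
few lines of Python. This is an illustration only; the function name
is hypothetical, and the phoneme lists are taken from the two
transcriptions given above.

```python
def phonemic(symbols, stressed_index):
    """Join phoneme symbols, placing ['] before the stressed vowel."""
    parts = []
    for i, symbol in enumerate(symbols):
        if i == stressed_index:
            parts.append("'")  # primary stress goes just before the vowel
        parts.append(symbol)
    return "[" + "".join(parts) + "]"

noun = phonemic(["ih", "n", "s", "rr", "t"], 0)
verb = phonemic(["ix", "n", "s", "rr", "t"], 3)
print(noun)  # ['ihnsrrt]
print(verb)  # [ixns'rrt]
```

The verb form moves the stress to [rr] and uses the reduced vowel
[ix] in the unstressed first syllable.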

PHONEMES
A  phoneme is the smallest unit of speech that distinguishes one
word  from  another.  Of all the sounds that  human  beings  can
produce,  relatively few are significant in  any  one  language.
Only  about 40 different functional sound types or phonemes  are
used in General  American English.

Pronouncing Phonemes
The  phonemes  of English are not pronounced the same  by  every
speaker. We all know people who pronounce some words differently
from the way we do, yet we understand them. The differences  may
occur  because  we  come from different parts  of  the  country.
Because   of  these  variations, there is no  such  thing  as  a
universal   standard pronunciation of American English.  DECtalk
attempts to mimic a  Midwestern (Northern Milwaukee) dialect.
Because DECtalk pronounces a phoneme in a standard rule-governed
way,  it  is not possible to imitate all other English  dialects
(although  you  can  approximate some dialectal  differences  by
phonemic spelling).

The   following  sections  describe  the  vowel  and   consonant
phonemes,   stress  and syntactic symbols, and  optional  direct
control of intonation or singing.

VOWEL AND CONSONANT PHONEMES
Linguists  have  identified  about  17  vowel  phonemes  and  24
consonant phonemes for American English. For vowels, tongue
position (high versus low in the mouth, and front versus back of
the mouth) correlates with the frequencies of the two lowest
natural resonances of the vocal tract. The lowest resonance
frequency is the first formant, F1, and the next lowest is the
second formant, F2.
Consonant  phonemes are typically described by their  places  of
major articulatory  constrictions and the manners of forming the
constrictions.

PHONEMIC REPRESENTATION
Appendix B lists the  consonant and vowel phonemes of English as
used  by  DECtalk.  The  symbols  used  for  each  phoneme   are
identified  by  a key word with the relevant phonemic  sound  in
italics.

        In  many  cases, phonemes are indicated by two  letters,
instead  of  special characters or diacritic symbols that  often
appear in dictionaries. DECtalk accepts phoneme symbols in
either uppercase or lowercase, although lowercase is more
commonly used. The letter pairs
have   been designed so that it is not necessary to put a  space
between   phonemes of a word. In fact, the space indicates  word
boundaries.   DECtalk can parse input phonemic letter  sequences
to determine the  unique phoneme sequence in all cases.
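
To illustrate why no spaces are needed between phonemes, the
following Python sketch splits a phonemic string into its symbols.
It is an illustration only and is not DECtalk's parser. The symbol
set is a partial list assembled from symbols that appear in this
chapter (Appendix B has the full inventory), and the function name
is hypothetical.

```python
# Partial symbol inventory, assumed from the examples in this chapter.
TWO_LETTER = {"aa", "ae", "ah", "ao", "ax", "ay", "eh", "ey", "ih", "ix",
              "iy", "ow", "uw", "rr", "el", "en", "hx", "nx", "ch", "jh",
              "sh", "th", "dh", "dx", "tx", "rx", "lx"}
ONE_LETTER = set("bdfgklmnpqrstvwyz")
MARKS = set("'`\"/\\-*#(),.?!+~_")  # stress and syntactic symbols
SYMBOLS = TWO_LETTER | ONE_LETTER | MARKS

def split_phonemes(text):
    """Return the list of symbols in text, or None if it cannot be parsed."""
    if not text:
        return []
    for size in (2, 1):  # try a letter pair first, then a single character
        head = text[:size]
        if len(head) == size and head in SYMBOLS:
            rest = split_phonemes(text[size:])
            if rest is not None:
                return [head] + rest
    return None  # neither choice led to a complete parse

print(split_phonemes("m'owtsaart"))
# ['m', "'", 'ow', 't', 's', 'aa', 'r', 't']
```

The sketch backs up when a letter pair leads nowhere, which covers
a [t] followed by [hx], where the letters "th" must not be read as
the single phoneme [th].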

        Phonemes are enclosed in square brackets rather than
between the more traditional slashes (/). The [ and ] characters
mark  the   beginning  and  end of phonemic  mode  clearly  with
distinctively   different  symbols.  The  input  format  is  not
strictly phonemic  because it also permits you to enter  certain
allophones  (variants  of a phoneme), making the  representation
closer  to  a  broad  phonetic transcription. When  the  command
[:phoneme  arpabet  speak on] is given, all text  within  square
brackets is treated as phonemic text.

PHONEMIC CORRECTION THE EASY WAY
Developers may wish to learn the phonemic code. However, you can
also  consult  one  of  the commonly available  dictionaries  to
determine  the  phonemic pronunciation for the  occasional  word
that DECtalk gets wrong.

For  example,  according to the Merriam-Webster Dictionary,  the
pronunciation of the word "Mozart" is:

       \'mot-,sart\

Using the table in Appendix B, you can convert this transcription
to the DECtalk phonemic string

       [m'owtsaart]

THE USER DICTIONARY
Every  time  DECtalk mispronounces a word in running text,  your
application  could  replace the text string ("Mozart")   with  a
phonemic string ([m'owtsaart]). However, if the number of  words
requiring  phonemic translation in an application is  small,  it
might  be simpler to create and download a dictionary to DECtalk
and  let  DECtalk  perform the  replacement  automatically.  The
DECtalk  board  has memory allocated for a loadable  dictionary.
This  dictionary is useful in cases where (a) DECtalk  makes  an
error in pronunciation, or (b) the pronunciation of a string  is
unique  to  the application.  For example, if the sequence  n/cl
should  be  pronounced as not cleared, then  a  user  dictionary
entry is obviously needed.
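
The first approach, in which the application replaces text strings
itself, might look like the following Python sketch. It is an
illustration only; the table and function names are assumptions,
and the bracketed output is spoken as phonemes only when phonemic
mode has been turned on.

```python
import re

# Hypothetical application table: word -> DECtalk phonemic string.
CORRECTIONS = {"Mozart": "[m'owtsaart]"}

def apply_corrections(text, corrections=CORRECTIONS):
    """Replace each whole word found in corrections with its phonemic form."""
    if not corrections:
        return text
    # Build one alternation of the escaped words, bounded by word edges.
    pattern = r"\b(" + "|".join(re.escape(w) for w in corrections) + r")\b"
    return re.sub(pattern, lambda m: corrections[m.group(1)], text)

print(apply_corrections("Mozart wrote operas."))
# [m'owtsaart] wrote operas.
```

Only whole words are replaced, so a word that merely contains a
listed word is left alone.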

        To create and download a dictionary to DECtalk, you must
do the following:

1.  Create  a  dictionary table file using a text   editor.  The
dictionary must be in the following format:

        (a) An entry must start at the first character of the
line. A line that does not begin with an entry in its first
character position is treated as a comment and is not
processed.

        (b)  The  syntax is grapheme string followed by  phoneme
string. A line may be up to 256 characters long.

        (c)  A  grapheme (letter) string is comprised  of  legal
graphemes.  Legal  graphemes are:  A-Z,  a-z,   0-9  and  select
punctuation  marks  ("!,  @, &, (,  ),  -,  \,   and  /).  These
punctuation  marks  may  not be used at  the  beginning  of  the
grapheme  string.  The grapheme string may be  in  either  case.
Uppercase letters match only uppercase; lowercase letters  match
either uppercase or lowercase.

        (d)  The  phoneme string is comprised of legal phonemes.
Phonemes  are  always in square brackets but may  be  in  either
upper or lower case.

For  example,   to make the word "coffee" be pronounced "tea",  you
would enter the following:

               coffee  [t'iy]

After  creating your dictionary file, you can compile  and  load
the dictionary by doing the following:

2. Compile the dictionary by typing:

        userdic  <input  dictionary  table>  <output  dictionary
file>

Input files have the default extension .tab, but any extension
can be used. Output dictionary files have the extension .dtu and
must   have  that  extension for the loader  to  find  the  file
correctly. If no output file is specified, a file with the  same
name and .dtu extension will be created for the output.

For example, if your dictionary table is called mydict.tab and
you type:

       userdic mydict

the USERDIC program uses the dictionary table mydict.tab to
create the dictionary mydict.dtu.

3. Load the user dictionary by typing:

       dt_load         <output file>

For example, you would type:

       dt_load  mydict.dtu

Your customized dictionary is now loaded.
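
Before compiling, it can help to check a dictionary table against
the format rules in step 1. The following Python sketch is one
assumed reading of those rules and is not the grammar used by the
USERDIC program; the function name is hypothetical, and the
punctuation set is taken from rule (c).

```python
import re

MAX_LINE = 256
# A grapheme string starts with a letter or digit and may continue with
# letters, digits, or the punctuation from rule (c). Assumed reading only.
ENTRY = re.compile(r"^([A-Za-z0-9][A-Za-z0-9!@&()\\/-]*)\s+(\[[^\]]+\])\s*$")

def parse_table_line(line):
    """Return (grapheme, phoneme) for an entry line, or None for a comment."""
    line = line.rstrip("\n")
    if len(line) > MAX_LINE:
        raise ValueError("line longer than 256 characters")
    if not line or line[0].isspace():
        return None  # rule (a): text not starting in column one is a comment
    match = ENTRY.match(line)
    if match is None:
        raise ValueError("malformed entry: " + line)
    return match.group(1), match.group(2)

print(parse_table_line("coffee  [t'iy]"))       # ('coffee', "[t'iy]")
print(parse_table_line("  this is a comment"))  # None
```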

Note: User dictionary lookups are done only on a single form of
the word. No affix stripping occurs in user dictionary lookups.
Therefore, inflected and derived forms must be entered
separately.
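
The case-matching rule and the absence of affix stripping can be
pictured with the following Python sketch. It models the rules as
stated in this chapter and is not the DECtalk lookup code; both
function names are hypothetical.

```python
def entry_matches(entry, word):
    """Uppercase in the entry matches only uppercase; lowercase in the
    entry matches either case."""
    if len(entry) != len(word):
        return False  # whole-word match only; no affix stripping
    for e, w in zip(entry, word):
        if e.islower():
            if e != w.lower():
                return False
        elif e != w:
            return False
    return True

def lookup(dictionary, word):
    """Return the phonemic string for word, or None if no entry matches."""
    for entry, phonemes in dictionary.items():
        if entry_matches(entry, word):
            return phonemes
    return None

user_dict = {"coffee": "[t'iy]"}
print(lookup(user_dict, "Coffee"))   # [t'iy]
print(lookup(user_dict, "coffees"))  # None
```

Because "coffees" does not match, the plural would need its own
dictionary entry.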

Warning:  If  your  PC  is powered down,  you  must  reload  the
dictionary at power-up or add its name to the end of the DT_LOAD
statement in the AUTOEXEC.BAT file.

Do not automatically assume that DECtalk will mispronounce a
word, even a difficult one. DECtalk often determines the correct
pronunciation of even difficult or very complicated words. With
the [:pronounce name] command, it also does a creditable job
with proper names.

VOWEL ALLOPHONES
While   DECtalk recognizes 17 vowel phonemes, these  vowels  can
sometimes   change slightly when surrounded by certain phonemes.
These  variants are discussed below.

Allophones for  Vowels + [r]
The  vowels in words  such as "beer," "bear," "bar," "bore," and
"poor"  are  different   from the available  vowel  phonemes  in
DECtalk.  They  require special  vowel-r allophones,  which  are
listed below.

The Schwa Allophones [ax] and [ix]
Another  problem  is with the unstressed  reduced  vowel  called
"schwa"  in English. The vowel appears in  words such  as  about
and  kisses. In "kisses," the vowel is  produced with  a  higher
tongue  position, symbolized by the vowel  allophone  [ix].  You
can  choose between [ax] and [ix]  by noting the characteristics
of  the  adjacent  phonemes, but  listening to  the  words  will
result in the best choice.

Syllabic Consonants
The  final  syllable in words such  as "butter,"  "bottle,"  and
"button" is usually symbolized in a  dictionary as consisting of
a  short  vowel  followed by a consonant.   For  better-sounding
synthesis,  DECtalk  uses a set of syllabic   consonants,  [rr],
[el],  and  [en]   that are realized  without the  short  schwa.
Syllabic  "r" shares the same symbol as  the phoneme [rr]  in  a
word  such  as  "bird," but this leads to no   confusion  inside
DECtalk.

The  [em]  allophone used in the earliest version of DECtalk  no
longer   exists and must be replaced by the two-phoneme sequence
[axm] as  in the word "bottom" = [b'aataxm].

In  most  situations,  you do not need  to  be  concerned  about
allophones  because  the vowel phonemes will  be   automatically
converted into the appropriate allophones by DECtalk  rules. For
the developer, allophone selection can be  induced or blocked by
using  the  syllable boundary phoneme [-] and  the rule-blocking
phoneme [~] , or by  inserting allophone symbols in the phonemic
spelling.

CONSONANTS
The  symbols  that represent consonants are straightforward.  In
one  case,  [hx] , the two-letter sequence ensures   unambiguous
parsing because the letter "h" is part of  some vowel symbols.
DECtalk  speaks  an  English dialect that does  not  distinguish
voiced  and  voiceless  w. Therefore,  words  like  "which"  and
"witch"  are pronounced alike as [w'ihch].

The  letter  "g" can be pronounced in two ways.  In  words  like
"gift," the consonant phoneme [g] is used. In words like  "gin,"
the phoneme [jh] is used.

The  letter  sequence "th" can be pronounced  with  a  voiceless
sound   [th]  as  in "thin" or with a voiced sound  [dh]  as  in
"this."

Consonant Allophones
The consonants [t], [d], [r], and [l] may be replaced by special
allophones under certain conditions.

Dental Flap [dx]
The  [t]  and [d] phonemes are often  replaced by a  very  brief
tongue  flap allophone [dx] when the  consonant phoneme  appears
between  two vowels and the second vowel  is unstressed. DECtalk
rules   automatically  insert  this  allophone  in   appropriate
situations.

Glottal t
The   [t]  phoneme  may  be  replaced  by  a   glottalized  [tx]
allophone,  especially in the word-final position  if  the  next
word begins with a sonorant consonant. DECtalk rules  insert the
allophone where appropriate.

Postvocalic [r]
The  [r] that appears after a vowel is  not as constricted as  a
word-initial [r]. DECtalk  automatically  selects this  somewhat
velarized  allophone  [rx]  or  an  r-colored   diphthong  where
appropriate.

Postvocalic [l]
The [l] that appears after a vowel may  sound different from the
[l]  in  other contexts. For some speakers,  the tongue tip  may
not  even  reach  the  roof  of  the  mouth.  This   postvocalic
allophone [lx] is automatically selected by DECtalk.

Glottal Stop [q]
The  glottal stop [q] is used in some  situations to indicate  a
word  boundary,  especially when the next  word  begins  with  a
vowel.  Overuse of this symbol can lead to a  stilted  style  of
speaking.

CONTROLLING ALLOPHONE SELECTION
DECtalk automatically  inserts certain other allophones for [k],
[q],  and  [nx] when  appropriate. It also selects the prevoiced
and  voiceless  unaspirated allophones of [b], [d], and [g]. You
cannot access  these allophones.

  If DECtalk does not select one of the allophones that have
their own symbols ([dx], [tx], [rx], [lx], or [q]), you can
insert the allophone symbol directly in a phonemic
representation of the word in question.

If  DECtalk uses one of these allophones inappropriately,  place
the  rule-blocking phoneme [~] before the phoneme in question to
block   application  of all allophonic substitution  rules.  For
example,  to  say "batter" without a flap being substituted  for
the [t], enter  the phonemic string [b'ae~trr].

SILENCE PHONEME [_]
DECtalk  automatically inserts a silence (brief pause)  whenever
punctuation appears in the text. The phonemic silence symbol [_]
is  useful  for  controlling silence  while  in  phonemic  mode.
Silences  and other pauses are described in more detail below.

STRESS AND SYNTACTIC SYMBOLS
Correct  speech is more than simply stringing together a  series
of   words or phonemes. The meaning of a sentence is carried  by
the   words, plus rhythm, stress, and intonation (pitch change).
You  recognize a question by the rising intonation of the voice,
while  a statement is usually accompanied by falling intonation.
A   speaker can give certain words in a sentence more importance
by   adding  stress (loudness, pitch and length) to them.  Pitch
often reveals  the emotional state of the speaker. For effective
communication,   you need to consider these expressive  features
as well as the  segmental features of speech.

As  any  good  actor knows, punctuation alone is not  enough  to
indicate  the  full meaning of a sentence. Some fine  points  of
expression  cannot be indicated by using phonemic symbols.  Full
control  of  the expression of a sentence is gained by  directly
changing  the  duration and pitch of words and  phrases  and  by
inserting pauses in the appropriate places.

DECtalk uses stress and syntactic symbols to control aspects of
rhythm,  stress, and intonation patterns. These symbols  include
punctuation  marks  such  as commas,  periods,  and  exclamation
marks.    Punctuation  marks  are  recognized  by   DECtalk   as
indicating   special   phrasing  requirements.   The   following
sections explain  how to improve the phrasing in DECtalk speech.
                                
                  STRESS AND SYNTACTIC SYMBOLS
                                
               Stress Symbols
       '               Primary Stress
       `               Secondary Stress
       "               Emphatic Stress
       /               Pitch Rise
       \               Pitch Fall
       /\              Pitch Rise and Fall

               Syntactic Symbols
       -               Syllable Boundary
       *               Morpheme Boundary
       #               Compound Noun
       (               Beginning of Prepositional Phrase
       )               Beginning of Verb Phrase
       ,               Clause Boundary
       .               End of Sentence
       ?               End of Question
       !               End of Exclamation
       +               New paragraph

Primary Stress [']
Most  content  words of English (nouns, verbs,  adjectives,  and
adverbs)   contain   one  primary  stressed  syllable.   DECtalk
represents  primary stress on a syllable with an apostrophe  [']
placed  immediately before  the stressed vowel  phoneme  of  the
word as in the following  example for the word butter.

       [bahtrr].               (No stress, flat intonation, too rapid.)
       [baht'rr].              (Stress on the wrong syllable.)
       [b'ahtrr].              (Correct.)

You  can also place the primary stress symbol between words,  in
which  case  it  modifies the next word.  For  example,  in  the
sentence   "He  rang  up the sale," DECtalk  treats  "up"  as  a
preposition   (without stress) instead of a  particle.  "Up"  is
correctly stressed  if you write the sentence as:

       He rang [']up the sale.

Do not put a space between a stress or syntactic phoneme (for
example, [']) and the following word.

Secondary Stress [`]
Use  the  secondary stress symbol [`] to indicate  a  degree  of
stress  that is between primary stress and unstressed. Secondary
stress is  appropriate in the following cases.

       To highlight the next strongest syllable of polysyllabic
       words, such as "demonstration."
               [d`ehmaxnstr'eyshaxn].

       On second parts of compound nouns, as in "answering
       machine."
               ['aensrrixnx#maxsh`iyn].

       In some very common words such as "I" and "we."

DECtalk realizes secondary stress by lengthening the vowel
sound more than an unstressed vowel (but less than one with
primary stress). A pitch rise may also occur on an early
secondary stress. In most cases, you can leave out the
secondary stress symbol.

Emphatic Stress ["]
You can place the emphatic stress symbol ["] before any vowel to
give  emphasis  to that syllable of the word.  Good  readers  of
English text understand the message of the sentence well  enough
to   pick  out the most important word and emphasize it. DECtalk
merely   pronounces words; it does not understand the  sentences
it is  saying. DECtalk cannot place emphasis on words to give  a
completely different meaning to the sentence unless you use  the
emphatic stress symbol. Here is an example.

       Dennis loves Mary.
        (Usual neutral pronunciation.)
       [d"ehnihs] loves Mary or ["]Dennis loves Mary.
        (Dennis -- not Frank -- loves Mary.)
       Dennis loves [m"ehriy] or Dennis loves ["]Mary.
        (Dennis loves Mary -- not Jill.)

The  exclamation point has a similar effect on the final  stress
of a sentence.
       Help!
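
An application that knows which word carries the message can add
the emphatic stress symbol itself. The following Python sketch is
an illustration only; the function name is hypothetical, and it
uses the between-words form of the symbol shown in the example
above.

```python
def emphasize(sentence, word):
    """Prefix the first occurrence of word with the emphatic stress symbol."""
    words = sentence.split(" ")
    for i, candidate in enumerate(words):
        # Ignore trailing punctuation when comparing words.
        if candidate.strip(".,?!") == word:
            words[i] = '["]' + candidate  # no space between symbol and word
            break
    return " ".join(words)

print(emphasize("Dennis loves Mary.", "Mary"))    # Dennis loves ["]Mary.
print(emphasize("Dennis loves Mary.", "Dennis"))  # ["]Dennis loves Mary.
```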

Unstressed Syllables
The  English  language contains a set of words that  are  either
unstressed  or have reduced stress. These are called  syntactic
function words and include the  following  types:
       Prepositions (for, over)
       Conjunctions (and, but)
       Determiners (the, some)
       Auxiliary verbs (is, has)
       Pronouns (her, myself)
       Clause introducers (which, that)

These words have reduced stress in their dictionary entries.  It
is  sometimes  necessary to emphasize a function  word  that  is
stored  in DECtalk's dictionary without stress. You can do  this
by   including  a  primary stress symbol or an  emphatic  stress
symbol  in   the  phonemic transcription  as  in  the  following
example.
       
       He went ['owvrr] (or [']over) the fence, not under it.
       It was the fence that he went ['owvrr] (or [']over).

Pitch Control  [/], [\], [/\]
DECtalk  contains built-in rules to determine the pitch  contour
of   a sentence. While these rules are correct most of the time,
you  can override them by placing the pitch rise [/], pitch fall
[\],  and pitch rise-and-fall [/\] symbols before selected words
(or  vowels if you want finer control).

The  [/]  and  [\] symbols must alternate, and the first  symbol
must   be a rise. Note that you can place both a rise and a fall
on the  same syllable by using [/\]. You can hear the difference
by trying  the following two sentences.

       It's a mad mad mad mad world.
       It's a [/]mad [\]mad [/]mad [\]mad [/\]world.
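
The alternation rule can be checked before text is sent to DECtalk.
The following Python sketch is an illustration only. The function
name is hypothetical, and treating [/\] as a rise that returns to
the "expect a rise" state is an assumption drawn from the example
above.

```python
import re

# Matches [/\], [/], or [\]; the two-character symbol is tried first.
PITCH = re.compile(r"\[(/\\|/|\\)\]")

def pitch_symbols_valid(text):
    """Check that rises and falls alternate and that the first is a rise."""
    expect_rise = True
    for match in PITCH.finditer(text):
        symbol = match.group(1)
        if symbol == "/\\":
            if not expect_rise:
                return False
            continue  # rise and fall together: a rise is expected next
        if (symbol == "/") != expect_rise:
            return False
        expect_rise = not expect_rise
    return True

print(pitch_symbols_valid(r"It's a [/]mad [\]mad [/]mad [\]mad [/\]world."))
# True
```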

Word Boundary
Any whitespace character (space, tab, or carriage return) in the
text  indicates  a  word boundary. DECtalk  uses  word  boundary
symbols to select the word-beginning or word-ending  allophone
of a  phoneme.

Some  applications automatically insert a carriage  return  into
lines that are too long (and would go off the edge of the screen
or  paper). This may cause DECtalk to pronounce text incorrectly
if a carriage return occurs in the middle of a word. You  can
prevent  this problem by breaking long sentences with a carriage
return at  an appropriate place.
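
An application that formats its own output can keep line breaks
out of the middle of words. The following Python sketch uses the
standard textwrap module; the function name and the default width
of 72 columns are assumptions.

```python
import textwrap

def wrap_for_dectalk(text, width=72):
    """Break text into lines at whitespace only, never inside a word."""
    return textwrap.wrap(text, width=width,
                         break_long_words=False,
                         break_on_hyphens=False)

sentence = "Following a long gasp, shouts were heard."
for line in wrap_for_dectalk(sentence, width=20):
    print(line)
```

A word longer than the width is kept whole on its own line rather
than being split.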

Syllable Boundary [-]
DECtalk uses a set of rules to determine where words break  into
syllables,  so  consonants within words are  assigned  to  their
correct  syllable. Use the syllable boundary symbol [-] to  tell
DECtalk where to assign the consonants within ambiguous words.
(This type of error rarely happens in DECtalk.)
       Example: oration
        [ow-r'eyshaxn] (DECtalk made an incorrect guess.)
        [owr-'eyshaxn] (Correct.)

Morpheme Boundary [*]
        English  words  are made up of meaningful  units  called
morphemes.  For  example, "spell" has only one  morpheme,  while
"misspelling" is  made up of three: "mis," "spell," and "ing."
In  most  cases, the pronunciation of a word does not depend  on
morpheme  boundaries. There are exceptions,  however,  in  which
case  the morpheme boundary symbol [*] can be used to force  the
correct   pronunciation.  For example, "misspelling"  should  be
pronounced with a double "s" because each "s" belongs to a
different   morpheme.   Adding  the  morpheme  boundary   symbol
improves  the pronunciation of the word.

       misspelling.

       [mixsp'ehlixnx] (text-to-phoneme translation by DECtalk).
        (The single "s" is too short.)

       [mixs*sp'ehlixnx]
        (Better.)

Compound Noun [#]
Compound  words, such as rush-hour, coffee cup, Thermos  bottle,
answering  machine, etc. should be spoken with less   stress  on
the second word. Also, words that were once compounds,  such  as
backache require decomposition for correct  pronunciation.
DECtalk's  dictionary  includes an extensive  list  of  compound
words.   You  can  use the compound-noun symbol [#]  to  correct
compounds  that   are not in the dictionary.  For  example,  for
"backache," type the  following phonemic transcription.

       [b'aek#`eyk].

Using a hyphen in compound words, for example, "back-ache" or
"rush-hour traffic," produces the correct pronunciation most of
the time. You rarely need the [*] and [#] phoneme symbols.

Beginning of Verb Phrase [)]
        Moderately long declarative sentences are usually spoken
as  if  they contain two units: a noun phrase and a verb phrase.
There  is   sometimes a slight pause between these two  phrases,
but there is also a slowing down at  the boundary, and the pitch
tends  to  fall  and  then  rise.  DECtalk   searches  for  this
syntactic  boundary  to change pitch. However,  the   rarity  or
ambiguity of some verbs can cause confusion.

       The old man in the chair was rocking slowly.
       (Correct verb phrase detected.)

       The old man in the chair sat rocking slowly.
        (Verb  phrase not detected; pure mechanical analysis  of
the sentence   does not show where "sat" belongs.)

       The old man in the chair [)s'aet] rocking slowly.
       (Phonemic correction.)

The right parenthesis [)] symbol is useful where a separation is
needed  between phrases but a comma is too strong. For  example,
you  can use [)] to indicate a dangling prepositional phrase.

       She hit the man with the umbrella.
       (The man carries the umbrella.)

       She hit the man [)] with the umbrella.
       (She uses the umbrella.)

NOTE:  Past versions of DECtalk also used the [)] symbol  for  a
second  function to indicate alternate  pronunciations of  words
that   are    spelled   the  same  but  pronounced   differently
(homographs).  (In  DECtalk V4.x, this has been  replaced  by  a
slash "/".) For example, the word  "insert" is either a noun  or
a verb. As a noun, it is pronounced  ['ihnsrrt] and as a verb it
is pronounced [ixns'rrt].

Clause Boundary [,]
When  a  sentence is composed of more than one clause, it should
be   spoken in such a way that the listener can easily  separate
the   sentence into its component clauses. The comma [,] is  the
symbol  used to indicate clause boundaries. A comma in text  and
a  comma in  phonemic transcription have identical impact on the
acoustic realization  of a sentence.
Inserting  a  comma improves the quality of spoken sentences  in
the  following cases.

       After an introductory prepositional phrase:
               In particular cars cause pollution.
               (Poor phrasing.)
               In particular, cars cause pollution.
               (Correct.)

       Around a parenthetical remark:
               A picture it seems is worth . . .
               (Poor phrasing.)
               A picture, it seems, is worth . . .
               (Correct.)

       In a list of more than two items:
               They ate apples oranges and bananas.
               (Poor phrasing.)
               They ate apples, oranges and bananas.
               (Correct.)

       After similar types of adjectives:
               The tall angular gentleman ...
               (Poor phrasing.)
               The tall, angular gentleman ...
               (Correct.)

       Around phrases and clauses in a particularly long
       sentence.

Period [.]
A sentence is usually a single, complete thought. It is also the
longest  utterance that you can comfortably speak in one breath.
DECtalk  inserts a pause when it finds a period that  marks  the
end   of the sentence, duplicating the human speaker's pause  to
take a  breath.

The  [.] symbol also tells DECtalk that a complete sentence  has
been  sent and it is safe to begin speaking. In letter and  word
mode,  DECtalk will speak immediately even if no period or comma
has  been seen.  DECtalk also tests each period to make sure  it
is  not part of a known abbreviation.

Question Mark [?]
The  simplest  way to indicate a question in  English  is  by  a
rising   tone  at the end of a sentence, although true  question
intonation  is not that simple and depends on the meaning of the
question.

There  are  many  cases  in English where  a  question  (rising)
intonation  is  not appropriate, even though the  sentence  ends
with   a  question mark. Rhetorical questions or quotations  may
contain  a  question  mark, but the speaker ends with  a  period
(falling  tone).  Sentences that begin with "wh"  words  ("who,"
"what")  usually  end  with a falling tone,  even  if  they  are
questions.   DECtalk is smart enough to recognize "wh" questions
and  speak them correctly.

       Laura ate her broccoli?
       (DECtalk asks a question.)

       What time is it?
       (DECtalk recognizes a wh-question and does not rise at
       the end.)

Exclamation Point [!]
Exclamations are short statements spoken with special  emphasis.
DECtalk  interprets an exclamation point to mean that  the  last
stressed syllable in the sentence should have extra emphasis.

       Stop!

Long sentences ending with an exclamation point typically have a
single  word that receives extra stress. DECtalk has no  way  of
knowing  which  word  to stress and chooses  the  last  word  by
default.   Use  the emphatic stress symbol ["]  to  emphasize  a
different word  when the last word is not appropriate.

       Joan won the marathon!
       (DECtalk emphasizes the last word.)

       ["] Joan won the marathon.
       (Correct.)

New Paragraph [+]
The  new  paragraph  phoneme  [+] should  be  inserted  in  text
wherever a new thought begins. (DECtalk does not do this
automatically because there is no standard new paragraph
indicator in general text; the tab is used in too many other
ways.)

The new paragraph phoneme [+] modifies the intonation contour
and adds variety to running text. The first sentence of a new
paragraph is produced with a higher, more lively fundamental
frequency. DECtalk also pauses longer between paragraphs to
give the listener an indication of a change of topic.

[+]   This  paragraph  has  the  [+]  phoneme  inserted  in  the
appropriate   place. The new paragraph symbol  can  be  used  in
other  situations,  such as to help indicate the start of a  new
mail message in a list  of mail messages.
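
As a rough illustration of how an application might supply the
[+] symbol itself, the Python sketch below prefixes every
paragraph with [+]. It assumes, purely for illustration, that
paragraphs in the incoming text are separated by blank lines.
The function name mark_paragraphs is hypothetical and is not
part of DECtalk.

```python
import re


def mark_paragraphs(text: str) -> str:
    """Prefix each blank-line-separated paragraph with the [+] symbol."""
    # Assumption: one or more blank lines separate paragraphs.
    pieces = re.split(r"\n\s*\n", text)
    paragraphs = [p.strip() for p in pieces if p.strip()]
    return "\n\n".join("[+] " + p for p in paragraphs)


if __name__ == "__main__":
    print(mark_paragraphs("First message.\n\nSecond message."))
```

A different paragraph convention (for example, an indented
first line) would need a different split rule.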

DIRECT CONTROL OF DURATION AND PITCH
Displaying  the  correct  emotion  through  voice  alone  is   a
difficult   task,  as any radio actor will tell  you.  The  best
method is to  experiment with phonemic symbols until you achieve
the quality you  want. Emotional content is usually connected to
the  sentence  content, so varying both together is the best way
to convey  feelings.

For example, you can have DECtalk say a simple phrase like
"Good morning" in several different ways.

       Good morning.
       (Normal tone.)

       Good morning!
       (Emphatic.)

       Good morning?
       (Questioning.)

       [g"uhd] morning.
       (Emphasize "good.")

If  these alternatives do not produce what you need, you can use
direct  prosodic control. You must represent the entire sentence
phonemically, specifying a duration for each phoneme  that  does
not  match the natural model. You should also give some  or  all
phonemes   specific  target pitch values. DECtalk  will  compute
smooth   transitions between pitch values, where  the  specified
pitch is  reached at the end of the phoneme.

DURATION AND PITCH [<>]
DECtalk  uses angle brackets [<>] to enclose duration and  pitch
values of phonemes. The format is
       <duration,pitch>

where duration is the length of the phoneme in milliseconds (ms)
and  pitch is the fundamental frequency of the phoneme in  hertz
(Hz).

Any phoneme may be followed by angle brackets to alter the
default duration and pitch. If either value is omitted, or
specified as 0, the default value is used. The values for
duration and pitch are separated by a comma.

       [ow]
       (Normal phonemic specification.)

       [ow<1000>]
       (1,000 ms duration.)

       [ow<,90>]
       (Default duration, 90 Hz pitch at end. Note the position
       of the comma.)

       [ow<1000,90>]
       (1,000 ms duration, 90 Hz pitch at end.)
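
The four forms above can be generated mechanically. The Python
sketch below is one possible helper for an application that
builds phonemic strings; the name with_prosody and its
parameters are hypothetical and are not part of DECtalk. A value
of 0 is treated as "use the default," matching the rule stated
above.

```python
def with_prosody(phoneme: str, duration_ms: int = 0, pitch_hz: int = 0) -> str:
    """Return the phoneme followed by <duration,pitch>; 0 means default."""
    if duration_ms < 0 or pitch_hz < 0:
        raise ValueError("duration and pitch must not be negative")
    if not duration_ms and not pitch_hz:
        return phoneme                          # nothing to override
    if not pitch_hz:
        return f"{phoneme}<{duration_ms}>"      # duration only
    duration = str(duration_ms) if duration_ms else ""
    return f"{phoneme}<{duration},{pitch_hz}>"  # comma marks the pitch slot


if __name__ == "__main__":
    print("[" + with_prosody("ow", 1000, 90) + "]")
```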

For  example,  to say "Oh?" with a greater degree of  skepticism
than  DECtalk normally imparts, you could type

       [_<,90>ow<400,150>].

The  [ow] phoneme begins at 90 Hz and ends (after 400 ms) at 150
Hz.

Note  the  use  of  the silence symbol [_] in the  example  just
given.  Pitch and duration values must always be attached  to  a
preceding  phoneme. The silence symbol is used so that the value
(90 Hz in  this example) is applied to the beginning tone of the
next spoken  phoneme [ow].

Many  of the phonemes (all except the stop  consonants p, t,  k,
b,  d, and g) can be sustained in a monotone  for an arbitrarily
long duration by using direct prosodic control.  For example, to
sustain "ah" for a duration of 10 seconds (10000  ms) at a pitch
of 120 Hz, type

       [_<,120>ah<10000,120>].
        (Produces "ahhhhhhh . . .")

To produce a prolonged sigh, type:

       [_<100,150>ah<2500,80>].

where  the silence phoneme causes the pitch contour to start  at
150  Hz at the beginning of the "ah" and end at 80 Hz at the end
of the  "ah."
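
The pattern used in these examples (a silence symbol that
carries the starting pitch, followed by a phoneme that carries
the duration and the ending pitch) can be sketched in Python as
follows. The function name pitch_glide is hypothetical, and the
sketch always leaves the duration of the silence at its default.

```python
def pitch_glide(phoneme: str, start_hz: int, end_hz: int,
                duration_ms: int) -> str:
    """Build a bracketed string whose pitch moves from start_hz to end_hz."""
    # A pitch target is reached at the END of the phoneme it follows,
    # so the starting pitch is attached to a leading silence symbol.
    lead_in = f"_<,{start_hz}>"
    body = f"{phoneme}<{duration_ms},{end_hz}>"
    return f"[{lead_in}{body}]"


if __name__ == "__main__":
    print(pitch_glide("ow", 90, 150, 400))      # skeptical "Oh?"
    print(pitch_glide("ah", 120, 120, 10000))   # sustained monotone
```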

SINGING
DECtalk can be made to "sing" by converting a text file into a
song file. In the song file, each word or syllable must first
be defined phonemically. Then the musical note and the duration
at which each is to be "sung" are defined.

Vibrato
Singing   uses   different   voice   control   techniques   than
conversation.  Even untrained singers add liveliness to the sung
notes  by  varying  pitch  slightly, a quality  called  vibrato.
Singing in DECtalk would sound mechanical without vibrato.

Assigning Pitch and Duration
The first number following a phoneme is the duration in
milliseconds, and the second number is the pitch in hertz (Hz)
or a coded value for the desired musical note. Vowels and
consonants not assigned a pitch remain at the same pitch as
preceding segments. You can intersperse silence phonemes if you
wish.

Sung and Non-Sung Pitches
DECtalk stays exactly on pitch when the pitch is specified in
hertz. You can add vibrato (to give a more realistic singing
quality) by replacing the pitch value with a coded value for
the desired musical note. The coded values range from 1 to 37.
Note 1 is C2 and note 37 is C5 on an equal-tempered scale (A4 =
440 Hz). C2 is the second C below middle C on a piano, C4 is
middle C, and so on. The table of codes for the musical notes
is listed at the end of this section.

When a coded note value is specified, DECtalk reaches the
desired pitch within about 100 ms after the start of the
phoneme and adds vibrato after changing to this pitch. When you
give a specific non-sung pitch, DECtalk reaches the pitch
target at the very end of the phoneme with no vibrato. The
following example makes DECtalk "sing" the first four notes of
Beethoven's Fifth Symphony.

       [d<100,17>aa<400> d<100,17>aa<400>]
       [d<100,17>aa<400> d<120,13>aa<700>]
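
A song file of any length is tedious to type by hand, so an
application might generate strings like these from a list of
notes. The Python sketch below follows the pattern of the
example above: the opening consonant carries the coded note
value, and the vowel carries only a duration and so stays on the
same pitch. The function names and the tuple layout are
hypothetical and are not part of DECtalk.

```python
def sung_syllable(onset: str, vowel: str, note: int,
                  onset_ms: int, vowel_ms: int) -> str:
    """Format one syllable; the onset carries the coded note value."""
    if not 1 <= note <= 37:
        raise ValueError("coded note values range from 1 to 37")
    return f"{onset}<{onset_ms},{note}>{vowel}<{vowel_ms}>"


def sung_phrase(syllables) -> str:
    """Join formatted syllables into one bracketed phonemic string."""
    return "[" + " ".join(sung_syllable(*s) for s in syllables) + "]"


# (onset, vowel, coded note, onset duration ms, vowel duration ms)
motif = [
    ("d", "aa", 17, 100, 400),
    ("d", "aa", 17, 100, 400),
    ("d", "aa", 17, 100, 400),
    ("d", "aa", 13, 120, 700),
]

if __name__ == "__main__":
    print(sung_phrase(motif[:2]))
    print(sung_phrase(motif[2:]))
```

Printing the motif two syllables at a time reproduces the two
bracketed lines of the example.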

Musical Note Codes
The  following  table contains the codes of  the  musical  notes
which can be used to allow your DECtalk to sing.

Table - Musical Note Codes

Coded  Note    Pitch (Hz)
Value
1      C2      65
2      C#      69
3      D       73
4      D#      77
5      E       82
6      F       87
7      F#      92
8      G       98
9      G#      103
10     A       110
11     A#      116
12     B       123
13     C3      130
14     C#      138
15     D       146
16     D#      155
17     E       164
18     F       174
19     F#      185
20     G       196
21     G#      207
22     A       220
23     A#      233
24     B       247
25     C4      261
26     C#      277
27     D       293
28     D#      311
29     E       329
30     F       348
31     F#      370
32     G       392
33     G#      415
34     A       440
35     A#      466
36     B       494
37     C5      523

Note: C4 is middle C.

Voice ranges (coded values):

Voice      Lowest note   Highest note
Bass       5  (E2)       29 (E4)
Baritone   8  (G2)       32 (G4)
Tenor      13 (C3)       36 (B4)
Alto       18 (F3)       37 (C5)
Soprano    24 (B3)       37 (C5)
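
Because the codes follow an equal-tempered scale with A4 = 440
Hz (coded value 34), each step of one code is one semitone, a
frequency ratio of 2^(1/12). The Python sketch below converts
between coded values and frequencies on that assumption. The
function names are hypothetical, and the computed frequencies
are unrounded, so they can differ from the whole-number table
entries by a hertz or so.

```python
import math

A4_CODE = 34     # from the table: coded value 34 is A at 440 Hz
A4_HZ = 440.0


def code_to_hz(code: int) -> float:
    """Equal-tempered frequency for a coded note value (1 to 37)."""
    if not 1 <= code <= 37:
        raise ValueError("coded note values range from 1 to 37")
    return A4_HZ * 2 ** ((code - A4_CODE) / 12)


def hz_to_code(hz: float) -> int:
    """Nearest coded note value for a frequency in hertz."""
    code = A4_CODE + round(12 * math.log2(hz / A4_HZ))
    if not 1 <= code <= 37:
        raise ValueError("frequency is outside the range C2 to C5")
    return code


if __name__ == "__main__":
    for code in (1, 13, 25, 34, 37):
        print(code, round(code_to_hz(code), 1))
```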
                                

End of Chapter 5.

  
