We want to generate speech that is
Linguistic specification guides what we generate
We use decision tree when we want learn rules from data.
Diphone database Requirements
The most common use of lexical stress marking is for determining which syllable in a word a pitch accent will be placed on if that word is made prosodically prominent.
The Tone and Break Indices (ToBI) model of prosody basically aims to capture prosodic prominence (pitch accents), boundary tones, and the extend of prosodic breaks (break indices). It doesn’t try to capture pragmatic or affective content of speech such as speech acts or emotions.
Spectral smoothing, as the name suggests, will help spectral discontinuity by making the change in the spectrum more smooth across a join.