When processing almost any text, we need to find the words. This involves splitting the input character sequence into tokens and normalising each token into words.
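The two steps above (splitting into tokens, then normalising each token) can be sketched roughly as follows. This is only a minimal illustration: the `ABBREVIATIONS` table, the function names, and the regular expression are assumptions made for this sketch, and a real front end would need context to resolve ambiguous tokens such as "St" (street or saint).

```python
import re

# Hypothetical lookup table for this sketch; a real system needs far richer rules.
ABBREVIATIONS = {"dr": "doctor", "st": "street"}

def tokenise(text):
    """Split a character sequence into tokens: runs of letters, digits and
    apostrophes, or single punctuation characters."""
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", text)

def normalise(token):
    """Map one token to a word: lower-case it and expand known abbreviations."""
    lowered = token.lower()
    return ABBREVIATIONS.get(lowered, lowered)

def find_words(text):
    """Tokenise, drop punctuation-only tokens, and normalise the rest."""
    return [normalise(t) for t in tokenise(text) if re.match(r"[A-Za-z0-9']", t)]

print(find_words("Dr Smith lives on Main St."))
```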
Every user of a language holds a lot of knowledge about that language in their mind. One way to capture and make use of that knowledge is in the form of rules.
Finite State Transducers provide general-purpose machinery for rewriting an input sequence as an output sequence. They have many uses, including verbalising non-standard words (NSWs) into natural language.
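As a rough illustration of rewriting an input sequence as an output sequence, here is a tiny hand-written transducer. It is a sketch under stated assumptions, not the course's implementation: the state names, the `TRANSITIONS` table, and the choice to read digits one by one (so "$42" becomes "four two dollars") are all invented for this example.

```python
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

def build_transitions():
    """Map (state, input symbol) -> (next state, list of output words)."""
    table = {("start", "$"): ("money_start", [])}  # consume "$", emit nothing yet
    for digit, word in DIGITS.items():
        table[("start", digit)] = ("number", [word])
        table[("number", digit)] = ("number", [word])
        table[("money_start", digit)] = ("money", [word])
        table[("money", digit)] = ("money", [word])
    return table

TRANSITIONS = build_transitions()

# Final states and the words each one emits when the input ends.
FINAL_OUTPUT = {"number": [], "money": ["dollars"]}

def transduce(symbols):
    """Run the transducer over the input symbols and return the output words."""
    state, output = "start", []
    for symbol in symbols:
        if (state, symbol) not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {symbol!r}")
        state, emitted = TRANSITIONS[(state, symbol)]
        output.extend(emitted)
    if state not in FINAL_OUTPUT:
        raise ValueError(f"input ended in non-final state {state!r}")
    return output + FINAL_OUTPUT[state]

print(" ".join(transduce("$42")))
```

Note how the currency word is emitted at the end even though its symbol appears first in the input: the state remembers that a "$" was seen.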
This video introduces the notion of phoneme as a basic unit of phonological analysis.
Difference between Phonetics and Phonology.
[ ] (square brackets) can indicate varying degrees of detail.
/ / (slashes) can only indicate abstract categories of phonemic contrast.
Easy to recognize the difference in underlying level.
Hard to recognize the difference in underlying level, for English.
These surface representations, written between square brackets, are known as allophones, and they are language-specific.
In Mapudungun, we can recognize the difference.
Allophones are language-specific: in English, the 2 surface representations share 1 underlying representation, whereas in Mapudungun the 2 surface representations correspond to 2 different underlying representations.
More examples
The phoneme inventory is a design choice when we build a TTS or ASR system. The IPA is a helpful guide when making this choice, but we don’t have to obey it, and are free to make different choices.
Prosody for Text-To-Speech can be reduced to the problem of predicting pausing, duration, and F0.
Because a decision tree only asks simple ‘yes or no’ questions about predictors, it works for both categorical and continuous predictors, or a mixture of both.
Having defined the model, we now need an algorithm to estimate it from data. For a Decision Tree, this is a simple greedy algorithm.
Training data
Goal: at each step, choose the question (query) that most reduces the entropy of the probability distribution.
Stop condition: the resulting data subset is too small, the result is acceptable, or the depth of the tree has reached the limit.
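The greedy procedure and stop conditions above can be sketched as follows. This is a minimal illustration, not the course's code: the toy data (a punctuation feature and a word-count feature predicting whether a pause follows), the parameter names `max_depth` and `min_rows`, and the dictionary-based tree are all assumptions. Numeric predictors get "is x <= v?" questions and categorical ones get "is x == v?", so the same yes/no machinery covers both.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of the label distribution."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def answer(question, features):
    """Yes/no question: 'x <= v?' for numbers, 'x == v?' for categories."""
    index, value = question
    if isinstance(value, (int, float)):
        return features[index] <= value
    return features[index] == value

def split(rows, question):
    yes = [r for r in rows if answer(question, r[0])]
    no = [r for r in rows if not answer(question, r[0])]
    return yes, no

def best_question(rows):
    """Greedy step: pick the question with the largest entropy reduction."""
    base = entropy([label for _, label in rows])
    candidates = {(i, v) for feats, _ in rows for i, v in enumerate(feats)}
    best, best_gain = None, 1e-12  # ignore gains that are only rounding noise
    for question in sorted(candidates, key=repr):
        yes, no = split(rows, question)
        if not yes or not no:
            continue  # this question does not separate the data
        weighted = (len(yes) * entropy([l for _, l in yes])
                    + len(no) * entropy([l for _, l in no])) / len(rows)
        if base - weighted > best_gain:
            best, best_gain = question, base - weighted
    return best

def grow(rows, depth=0, max_depth=3, min_rows=2):
    """Grow the tree; stop on small data, depth limit, or no useful question."""
    question = None
    if depth < max_depth and len(rows) >= min_rows:
        question = best_question(rows)
    if question is None:  # leaf: predict the most common label
        return Counter(label for _, label in rows).most_common(1)[0][0]
    yes, no = split(rows, question)
    return {"question": question,
            "yes": grow(yes, depth + 1, max_depth, min_rows),
            "no": grow(no, depth + 1, max_depth, min_rows)}

def predict(tree, features):
    while isinstance(tree, dict):
        tree = tree["yes"] if answer(tree["question"], features) else tree["no"]
    return tree

# Invented toy data: (following punctuation, words since last pause) -> label.
TOY_ROWS = [(("comma", 3), "break"), (("none", 2), "no_break"),
            (("none", 9), "break"), (("period", 5), "break"),
            (("none", 4), "no_break"), (("none", 1), "no_break")]

tree = grow(TOY_ROWS)
print(tree)
```

Because each split is chosen only for its immediate entropy reduction, the algorithm never revisits earlier questions; that is what makes it greedy, and also why the resulting tree is not guaranteed to be the best possible one.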
Origin: Module 5 speech synthesis – phonemes and the front end Translate + Edit: YangSier (Homepage)