Quasi-syllabic and quasi-articulatory-gestural units for concatenative speech synthesis

In this paper we propose methods of speech segmentation and unit characterization that are motivated by prosodic and physiological principles. In particular, we motivate and describe algorithms for unit-database creation based on quasi-syllables and quasi-articulatory gestures that are defined and parameterized purely by acoustic measurements. This approach is intended to free concatenative speech synthesis from its reliance on the phonetic code.
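The abstract does not specify the segmentation procedure, but a minimal sketch may clarify what "defined purely by acoustic measurements" could look like in practice. The Python sketch below marks local minima of a smoothed short-time energy envelope as candidate quasi-syllable boundaries, so that no phonetic transcription is consulted. The function name, window and hop sizes, smoothing width, and the minimum-duration constraint are all illustrative assumptions, not parameters from the paper.

import numpy as np

def quasi_syllable_boundaries(signal, sr, win_ms=25.0, hop_ms=10.0,
                              smooth_frames=15, min_syll_ms=80.0):
    """Locate candidate quasi-syllable boundaries from the waveform alone.

    Frames the signal, computes a short-time log-energy envelope,
    smooths it, and returns the times (in seconds) of local energy
    minima separating successive energy peaks (syllable nuclei).
    All defaults are illustrative assumptions.
    """
    win = int(sr * win_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - win) // hop)

    # Short-time log energy per frame.
    energy = np.array([
        np.log(np.sum(signal[i * hop:i * hop + win] ** 2) + 1e-10)
        for i in range(n_frames)
    ])

    # Moving-average smoothing to suppress pitch-rate fluctuation.
    kernel = np.ones(smooth_frames) / smooth_frames
    envelope = np.convolve(energy, kernel, mode="same")

    # Local minima of the envelope are candidate boundaries.
    minima = [i for i in range(1, len(envelope) - 1)
              if envelope[i] < envelope[i - 1]
              and envelope[i] <= envelope[i + 1]]

    # Enforce a minimum quasi-syllable duration between boundaries.
    min_gap = int(min_syll_ms / hop_ms)
    boundaries, last = [], -min_gap
    for i in minima:
        if i - last >= min_gap:
            boundaries.append(i * hop / sr)  # frame index -> seconds
            last = i
    return boundaries

if __name__ == "__main__":
    # Example: amplitude-modulated noise with ~4 energy peaks/second
    # stands in for a one-second speech-like signal.
    sr = 16000
    t = np.linspace(0, 1, sr, endpoint=False)
    sig = np.random.randn(sr) * (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t))
    print(quasi_syllable_boundaries(sig, sr))

The same acoustic-only philosophy would then apply to characterizing the units, e.g. parameterizing each segment by envelope shape and spectral trajectories rather than by phone labels.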