Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems

In this paper, a data pruning approach is presented for building the acoustic unit inventory of a syllable-based concatenative embeddable Chinese TTS system. A three-portion segmentation of the syllable is proposed, based on the voiced/unvoiced structure of Chinese syllables. Individual factorial acoustic measurements of each syllable are used to calculate a penalty for perceptually unsatisfactory concatenation. Based on the calculated penalties, bad syllables are removed from each cluster. The best syllable of each pruned cluster is then selected using a compromise acoustic measurement. Evaluation and application results show that the method is promising, particularly for generating acoustic unit databases for small-footprint concatenative Chinese (Cantonese and Mandarin) TTS systems.
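The prune-then-select idea can be illustrated with a minimal sketch. It is not the paper's actual formulation, and everything specific in it is an assumption made for illustration: each candidate syllable is represented as a tuple of per-factor acoustic measurements (for example duration, pitch, energy), the per-factor penalty is taken to be the absolute z-score within the cluster, a candidate is pruned when any factor penalty exceeds a hypothetical `threshold`, and the "compromise" measurement is modelled as a weighted sum of the factor penalties.

```python
from statistics import mean, pstdev


def factor_penalties(cluster):
    """Return, for each candidate, one penalty per acoustic factor.

    Assumption: the penalty is the absolute z-score of the factor value
    within the cluster. The paper's real penalty function is not given here.
    """
    n_factors = len(cluster[0])
    columns = [[cand[i] for cand in cluster] for i in range(n_factors)]
    mus = [mean(col) for col in columns]
    # Guard against zero spread so identical candidates get zero penalty.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [
        [abs(x - mu) / s for x, mu, s in zip(cand, mus, sigmas)]
        for cand in cluster
    ]


def select_unit(cluster, threshold=1.5, weights=None):
    """Prune bad candidates, then return the index of the best survivor.

    `threshold` and `weights` are hypothetical tuning parameters.
    """
    penalties = factor_penalties(cluster)
    weights = weights or [1.0] * len(cluster[0])
    # Pruning step: drop any candidate with an excessive factor penalty.
    kept = [i for i, p in enumerate(penalties) if max(p) <= threshold]
    if not kept:
        # Never leave a cluster without a unit; fall back to all candidates.
        kept = list(range(len(cluster)))
    # Selection step: smallest weighted sum of penalties (the "compromise").
    return min(
        kept,
        key=lambda i: sum(w * p for w, p in zip(weights, penalties[i])),
    )
```

Called on a cluster such as `[(200, 120, 60), (210, 125, 62), (205, 122, 61), (400, 300, 20)]`, the last candidate deviates strongly on every factor, so it is pruned before selection and the returned index refers to one of the first three candidates.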