Prosodic Phrase Boundary Classification Based on Czech Speech Corpora

Correct placement of phrase boundaries is important for natural-sounding and easily intelligible speech, so it is not surprising that boundary detection is also part of text-to-speech systems. In this paper, large speech corpora are used for a classification-based approach that aims to improve the phrasing of synthesized sentences. The paper compares the results of several classifiers with deterministic approaches based on punctuation and conjunctions and shows that the classifiers are able to outperform these simple algorithms.
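To make the deterministic baseline concrete, the following is a minimal sketch of a punctuation-and-conjunction rule. It is an illustration only, not the rule set evaluated in the paper: the conjunction list, the punctuation set, and the function names `tokenize` and `predict_breaks` are all assumptions made here.

```python
import re

# Hypothetical rule inventory; the paper does not list its exact rules.
CONJUNCTIONS = {"a", "ale", "nebo", "protože", "že"}  # assumed Czech conjunctions
PUNCTUATION = {",", ";", ":"}                          # assumed sentence-internal marks


def tokenize(sentence):
    """Split a sentence into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def predict_breaks(tokens):
    """Return indices of word tokens after which a phrase boundary is placed.

    A boundary is predicted after a word when the next token is a
    sentence-internal punctuation mark or a conjunction.
    """
    breaks = []
    for i, tok in enumerate(tokens[:-1]):
        if tok in PUNCTUATION:
            continue  # boundaries are attached to words, not to punctuation
        nxt = tokens[i + 1]
        if nxt in PUNCTUATION or nxt.lower() in CONJUNCTIONS:
            breaks.append(i)
    return breaks


tokens = tokenize("Přišel domů, ale nikdo tam nebyl a světlo svítilo.")
breaks = predict_breaks(tokens)
print([tokens[i] for i in breaks])
```

A classifier-based approach would replace the fixed rule in `predict_breaks` with a model trained on boundary annotations from the corpora, with features such as punctuation and conjunctions becoming inputs rather than hard decisions.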
