Correction of Formal Prosodic Structures in Czech Corpora Using Legendre Polynomials

Naturalness is an important aspect of speech synthesis: without it, synthesized speech is neither pleasant nor easy to listen to and understand. In unit selection, however, unexpected changes in \(F_0\) at unit transitions can lead to inconsistent prosody. This paper proposes a two-phase, classification-based method that improves overall prosody by correcting the formal prosodic description of speech corpora. The speech data are represented using Legendre polynomials.
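To illustrate what a Legendre-polynomial representation of an \(F_0\) contour can look like, here is a minimal sketch using NumPy. The abstract does not give the polynomial order, the time normalization, or the unit over which contours are fitted, so the degree of 3, the mapping of time onto [-1, 1], and the function names `legendre_features` and `reconstruct_contour` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_features(f0, degree=3):
    """Least-squares fit of a Legendre series to an F0 contour.

    Time is mapped onto [-1, 1], the interval on which Legendre
    polynomials are orthogonal. The degree is an assumed value.
    Returns degree + 1 coefficients, lowest order first.
    """
    f0 = np.asarray(f0, dtype=float)
    x = np.linspace(-1.0, 1.0, len(f0))
    return legendre.legfit(x, f0, degree)


def reconstruct_contour(coeffs, n_points):
    """Evaluate the fitted Legendre series at n_points evenly spaced times."""
    x = np.linspace(-1.0, 1.0, n_points)
    return legendre.legval(x, coeffs)


# Hypothetical contour: F0 rising linearly from 110 Hz to 130 Hz.
contour = np.linspace(110.0, 130.0, 50)
coeffs = legendre_features(contour)
smoothed = reconstruct_contour(coeffs, len(contour))
```

In this kind of representation the zeroth coefficient captures the mean pitch level, the first the overall slope, and higher orders the curvature, so a short fixed-length vector of coefficients could serve as the input to a classifier.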
