Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features

This paper compares three approaches to prosodic boundary prediction in Russian text: a rule-based method, a statistical classifier, and a deep learning model. All three aim to predict every possible prosodic boundary in a text from morphological and syntactic information. All features are expressed in the Universal Dependencies framework and obtained with the SyntaxNet parser. The rule-based method works bottom-up, using the edges of syntactic groups and applying both data-driven and hand-written linguistic rules. For the machine learning methods, we built a conditional random fields (CRF) classifier and a bidirectional LSTM (BiLSTM) model, using the following features: part-of-speech tag, syntactic dependency type, syntactic relation embedding, and the presence of a syntactic link between the current word and its adjacent words. The experimental material was the CORPRES corpus, which contains over 30 hours of professionally read speech. Used separately, morphological features are slightly superior to syntactic ones, and combining them improves the results. The BiLSTM yields the highest F1 measure, 90.4, compared with 88.8 for the CRF and 83.1 for the rule-based method.
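To make the feature set more concrete, the following minimal Python sketch shows one way per-token features of this kind could be derived from a Universal Dependencies parse. It is an illustration under stated assumptions, not the implementation used in the paper: the `Token` structure, the function names, the feature keys, and the example sentence with its annotation are all hypothetical, and the syntactic relation embedding is omitted.

```python
from typing import Dict, List, NamedTuple, Union


class Token(NamedTuple):
    """One parsed word (hypothetical structure, loosely following CoNLL-U columns)."""
    form: str    # word form
    upos: str    # Universal Dependencies part-of-speech tag
    deprel: str  # UD dependency relation to the head
    head: int    # 1-based index of the head token, 0 for the root


def linked(a: int, b: int, sent: List[Token]) -> bool:
    """True if the tokens at 0-based positions a and b are joined by a dependency arc."""
    if not (0 <= a < len(sent) and 0 <= b < len(sent)):
        return False  # no neighbour at the sentence edge
    return sent[a].head == b + 1 or sent[b].head == a + 1


def token_features(sent: List[Token], i: int) -> Dict[str, Union[str, bool]]:
    """Features for token i: POS tag, dependency type, links to adjacent words."""
    tok = sent[i]
    return {
        "upos": tok.upos,
        "deprel": tok.deprel,
        "link_prev": linked(i, i - 1, sent),
        "link_next": linked(i, i + 1, sent),
    }


def sentence_features(sent: List[Token]) -> List[Dict[str, Union[str, bool]]]:
    """One feature dict per token, the usual input shape for a sequence labeller."""
    return [token_features(sent, i) for i in range(len(sent))]


# Illustrative sentence "Мама мыла раму" with an assumed UD analysis.
example = [
    Token("Мама", "NOUN", "nsubj", 2),
    Token("мыла", "VERB", "root", 0),
    Token("раму", "NOUN", "obj", 2),
]
features = sentence_features(example)
```

A list of feature dictionaries like `features`, paired with per-token boundary labels, is the kind of input a CRF toolkit typically expects; a BiLSTM would instead map the categorical values to embeddings.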
