Comparison of Recurrent Neural Networks for Slovak Punctuation Restoration

This paper proposes a punctuation restoration system based on recurrent neural networks that supplements a Slovak automatic speech recognition system. It compares long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional architectures on the same training and evaluation sets to determine which is best suited to the task. Experiments show significant differences among the recurrent architectures: classification performance depends strongly on the learning parameters and on the size of the training data, and every architecture is prone to overfitting. The best results were obtained with bidirectional networks using gated recurrent units.
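The paper does not give implementation details, but the core of the winning architecture, a GRU cell, can be sketched in plain Python to show what the recurrence computes at each token. This is a minimal illustrative implementation of the standard GRU update equations (Cho et al., reference [16]), not the authors' system; the function and parameter names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, W, U, b):
    """One GRU recurrence step (toy, pure-Python sketch).

    x : input vector at the current token
    h : previous hidden state
    W, U, b : dicts keyed by 'z' (update gate), 'r' (reset gate),
              'h' (candidate state), holding input weights (list of
              rows), recurrent weights, and biases, respectively.
    """
    def affine(Wk, Uk, bk, xv, hv):
        # Wk @ x + Uk @ h + bk, one output per hidden unit.
        return [sum(Wk[i][j] * xv[j] for j in range(len(xv)))
                + sum(Uk[i][j] * hv[j] for j in range(len(hv)))
                + bk[i]
                for i in range(len(hv))]

    z = [sigmoid(v) for v in affine(W['z'], U['z'], b['z'], x, h)]
    r = [sigmoid(v) for v in affine(W['r'], U['r'], b['r'], x, h)]
    rh = [r[i] * h[i] for i in range(len(h))]          # reset-gated state
    h_tilde = [math.tanh(v) for v in affine(W['h'], U['h'], b['h'], x, rh)]
    # Update gate z interpolates between old state and candidate.
    return [(1 - z[i]) * h[i] + z[i] * h_tilde[i] for i in range(len(h))]
```

A bidirectional variant, as favored by the paper's experiments, runs this recurrence once left-to-right and once right-to-left over the token sequence and concatenates the two hidden states before the punctuation classifier, so each prediction can use context on both sides of the word boundary.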
