Towards a continuous speech corpus for banking domain automatic speech recognition

This paper presents work towards a Romanian speech corpus for automatic speech recognition in the banking domain. The work is carried out in the context of the Speech2Process project, which aims to create a system that makes interaction between customers and contact center agents much easier. The application that uses the banking corpus will provide automatic responses to client requests received through voice communication protocols in customer support services.
