Turkish LVCSR system for call center conversations

This paper presents a Turkish large vocabulary continuous speech recognition (LVCSR) system that automatically transcribes agent-customer conversations in call centers. The aim is to increase the recognition performance of the system so that correct statistics can be retrieved from the call center conversations. To this end, a confidence metric calculated from the outputs of two decoders that use different language model units is also proposed. Tested on real call center data, the system achieves a word error rate of 22.5% on agent speech and 40.7% on customer speech. When the proposed confidence metric is used, the customer and agent word recognition rates increase by up to 19.4% and 12.3% absolute, respectively, while nearly 70% of the call data is still covered.
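The word error rate figures above follow the standard definition: the word-level edit distance between the reference and the recognizer output, divided by the number of reference words. The sketch below computes it and also shows one simple way that two decoder outputs could be compared to produce a score for filtering calls. The `agreement_confidence` function is a hypothetical illustration only; the abstract does not specify how the proposed confidence metric is calculated, and this is not the paper's formula.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Standard WER: edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def agreement_confidence(hyp_a, hyp_b):
    """Hypothetical score in [0, 1]: how closely two decoder outputs agree.

    Assumption for illustration only, not the metric proposed in the paper.
    """
    a, b = hyp_a.split(), hyp_b.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

Under this assumed scheme, calls whose two transcripts score below a chosen threshold would be set aside, which trades coverage of the call data for a higher recognition rate on the calls that are kept.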
