Paper information - Human voice or prompt generation? can they co-exist in an application?

Human voice or prompt generation? can they co-exist in an application?

This paper describes an R&D project on procedures for the automatic maintenance of the interactive voice response (IVR) system of a mobile telecom operator. The original plan was to create a generic voice prompt generation system for the customer service department. The challenge was to create a solution that is hard to distinguish from the human speaker (i.e., one that passes a sort of Turing test), so that its output can be freely mixed with original human recordings. As a first step, the domain of the solution had to be narrowed down to the price lists of available mobile phones and services. These lists are updated weekly, so the final operational system generates about 3 hours of speech every weekend. It operates under human supervision, but without intervention in the speech generation process. It was tested both by academic procedures and by company customers, and was accepted as fulfilling the original requirements.

Géza Németh | Géza Kiss | Csaba Zainkó | Gábor Olaszy | Mátyás Bartalis

[1] Géza Németh, et al. Corpus-Based Unit Selection TTS for Hungarian, 2006, TSD.

[2] Péter Olaszi. Analysis of Written and Spoken Form of Hungarian Numbers for TTS Applications, 2000, Int. J. Speech Technol.

[3] Bernd Möbius. Corpus-based speech synthesis: Methods and challenges, 2000.

[4] Péter Tatai, et al. Phonetic transcription in automatic speech recognition, 2002.

[5] Peter Rutten, et al. Rvoice Studio and Activeprompts, 2004, SSW.

[6] Justin Fackrell, et al. The application of interactive speech unit selection in TTS systems, 2003, INTERSPEECH.