Paper information - An open source speech synthesis module for a visual-speech recognition system

A Silent Speech Interface (SSI) is a voice replacement technology that permits speech communication without vocalisation. The visual-speech recognition engine of the proposed SSI is based on vocal tract imaging. The system aims to give laryngectomised speakers the opportunity to speak with their original voice. This paper presents the speech synthesis module of an SSI, which is built on the open-source MaryTTS (Text-To-Speech) platform. The visual-speech recognition engine of the SSI outputs a text sentence, which is passed to the speech synthesis module to synthesise speech in French or English. A new phonetic transcription module has been developed and integrated into MaryTTS. In addition, English and French voices based on hidden semi-Markov models (semi-HMMs) have been built. The SSI can be controlled remotely from a mobile device, and the new voices are installed on a web server.
