Paper information - VOICECONET: A Collaborative Framework for Speech-Based Computer Accessibility with a Case Study for Brazilian Portuguese

In recent years, ever faster processors have improved the performance of personal computers, which makes it practical to adopt speech processing in computer-assisted education. Several speech technologies are effective in education; the most prominent are text-to-speech (TTS) and automatic speech recognition (ASR). TTS systems [45] are software modules that convert natural language text into synthesized speech. ASR [18] can be seen as the inverse of TTS: the digitized speech signal, captured for example via a microphone, is converted into text.

Aldebaro Klautau | Nelson Neto | Pedro Batista

[1] Jorge Proença, et al. Computational Processing of the Portuguese Language, 2014, Lecture Notes in Computer Science.

[2] Paul Taylor, et al. Festival Speech Synthesis System, 1998.

[3] S. Ramakrishnan. Modern Speech Recognition Approaches with Case Studies, 2012.

[4] Geir Gunnarsson. Data Driven Methods in Speech Synthesis, 2005.

[5] Patrick Silva, et al. An Open-Source Speech Recognizer for Brazilian Portuguese with a Windows Programming Interface, 2010, PROPOR.

[6] Kiyohiro Shikano, et al. Julius - an open source real-time large vocabulary recognition engine, 2001, INTERSPEECH.

[7] John J. Godfrey, et al. SWITCHBOARD: telephone speech corpus for research and development, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8] Thierry Dutoit, et al. The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[9] Tzu-Hua Wang, et al. Web-based dynamic assessment: Taking assessment as teaching and learning strategy for improving students' e-Learning effectiveness, 2010, Comput. Educ.

[10] António J. S. Teixeira, et al. On the Use of Machine Learning and Syllable Information in European Portuguese Grapheme-Phone Conversion, 2006, PROPOR.

[11] Gheorghe Sabau, et al. Collaborative Network for the Development of an Informational System in the SOA Context for the University Management, 2009, 2009 International Conference on Computer Technology and Development.

[12] Giuliano Antoniol, et al. Radiological Reporting Based on Voice Recognition, 1993, EWHCI.

[13] Jean-Luc Gauvain, et al. Speaker adaptation based on MAP estimation of HMM parameters, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[14] Alex Acero, et al. Spoken Language Processing, 2001.

[15] A. A. Fidalgo-Neto, et al. The use of computers in Brazilian primary and secondary schools, 2009, Comput. Educ.

[16] Michael F. McTear, et al. Speech recognition in the secondary school classroom: an exploratory study, 1999, Comput. Educ.

[17] Oscar Saz-Torralba, et al. Tools and Technologies for Computer-Aided Speech and Language Therapy, 2009, Speech Commun.

[18] Biing-Hwang Juang, et al. Hidden Markov Models for Speech Recognition, 1991.

[19] Aldebaro Klautau, et al. A Computer-assisted Learning Software Using Speech Synthesis and Recognition in Brazilian Portuguese, 2009.

[20] András Kornai. Extended finite state models of language, 1996, Nat. Lang. Eng.

[21] D.C. Silva, et al. A rule-based grapheme-phone converter and stress determination for Brazilian Portuguese natural language processing, 2006, 2006 International Telecommunications Symposium.

[22] Jane Seale, et al. E-learning and accessibility: An exploration of the potential role of generic pedagogical tools, 2010, Comput. Educ.

[23] Paul Lamere, et al. Sphinx-4: a flexible open source framework for speech recognition, 2004.

[24] Robert Agranoff, et al. Inside Collaborative Networks: Ten Lessons for Public Managers, 2006.

[25] Alexander I. Rudnicky, et al. Pocketsphinx: A Free, Real-Time Continuous Speech Recognition System for Hand-Held Devices, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[26] Heiga Zen, et al. An HMM-Based Brazilian Portuguese Speech Synthesizer and Its Characteristics, 2015. DOI: 10.14209/jcis.2006.11.

[27] Marc Schröder, et al. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching, 2003, Int. J. Speech Technol.

[28] Keiichi Tokuda, et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, 1999, EUROSPEECH.

[29] Isabel Trancoso, et al. Grapheme-to-phone using finite-state transducers, 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002.

[30] Информатика, et al. Microsoft Speech API, 2010.

[31] Maria Helena Mira Mateus. Introdução a estudos de fonologia do português brasileiro, 2000.

[32] N. Deshmukh, et al. Hierarchical search for large-vocabulary conversational speech recognition: working toward a solution to the decoding problem, 1999.

[33] Maxine Eskénazi, et al. An overview of spoken language technology for education, 2009, Speech Commun.

[34] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[35] Paul Taylor, et al. Text-to-Speech Synthesis, 2009.

[36] Dante Barone, et al. A brazilian portuguese language corpus development, 2000, INTERSPEECH.

[37] Jelena Kovacevic, et al. Reproducible research in signal processing, 2009, IEEE Signal Process. Mag.