Software using speech recognition and synthesis: the state of the art for Brazilian Portuguese

Speech is a natural interface for human-computer interaction. Internationally, speech (or voice) technology is a well-developed field, with a wide variety of academic and industrial software. Most of this software assumes that a recognizer or synthesizer is available and can be programmed through an API. In contrast, few resources for Brazilian Portuguese are available in the public domain. This work discusses some of these issues and compares SAPI and JSAPI, the APIs promoted by Microsoft and Sun, respectively. We also present two examples: a JSAPI-based tic-tac-toe game that uses Portuguese digit recognition, and a computer-aided language learning (CALL) application that uses SAPI-based speech recognition in English and synthesis in Portuguese.