The Sphinx-4 speech recognition system is the latest addition to Carnegie Mellon University's repository of Sphinx speech recognition systems. It was designed jointly by Carnegie Mellon University, Sun Microsystems Laboratories, and Mitsubishi Electric Research Laboratories. Its design differs from that of the earlier Sphinx systems in modularity, flexibility, and algorithmic aspects. It uses newer search strategies and accepts various kinds of grammars and language models, acoustic model types, and feature streams. Algorithmic innovations in the system design enable it to incorporate multiple information sources in an elegant manner. The system is developed entirely on the Java™ platform and is highly portable, flexible, and easier to use with multithreading. This paper describes the salient features of the Sphinx-4 decoder and includes preliminary performance measures relating to speed and accuracy.