Characterizing the SPHINX Speech Recognition System

This paper examines SPHINX, a system for speaker-independent, large-vocabulary, continuous speech recognition. We find that SPHINX in particular, and speech recognition systems in general, behave substantially differently from the compute-bound benchmarks that have traditionally driven popular computer system design. SPHINX places a considerable load on the memory hierarchy, with a large primary working set and poor locality. We quantify these results and correlate them with the source code, showing that they are a consequence of the algorithms used rather than of specific implementation details of the processor or of the way the application is coded. The unprecedented growth of speech recognition applications makes it imperative that system designers give them due consideration when designing the next generation of computer systems.