Moving speech recognition from software to silicon: the in silico vox project

To achieve much faster decoding, or much lower power consumption, we need to liberate speech recognition from the artificial constraints of its current software-only form and move the essential computations directly into silicon. There are vast efficiencies waiting to be unlocked in this application, but unlocking them requires the proper architecture. We report results from a first-generation hardware architecture simulated at the bit level, and from partial, working FPGA-based prototypes. Simulation results show that rather modest hardware designs, running 10-20X slower than conventional processors, can already decode at 0.6 xRT (that is, faster than real time) on the standard 5K Wall Street Journal benchmark.
