Feature pruning for low-power ASR systems in clean and noisy environments

Likelihood evaluation can substantially affect the total computational load of continuous hidden Markov model (HMM)-based speech-recognition systems with small vocabularies. This letter presents feature pruning, a simple yet effective technique that reduces the computation, and hence the power consumption, of likelihood evaluation. Under certain conditions, the technique evaluates the likelihoods of only a fraction of the feature elements and approximates those of the remaining (pruned) elements by a simple function. The order in which feature elements are evaluated is obtained by a data-driven approach that minimizes computation. With this order, feature pruning speeds up likelihood evaluation by a factor of 1.3 to 1.8 and reduces its power consumption by 27% to 43% across various recognition tasks, including those in noisy environments.
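
The idea of evaluating only some feature elements can be illustrated with a minimal sketch. It is not the letter's actual algorithm: the abstract does not state the pruning condition or the approximating function, so the sketch assumes a diagonal-covariance Gaussian, a pruning test that fires when the partial log-likelihood falls below a hypothetical `threshold`, and a hypothetical constant `fill` substituted for each pruned element. The evaluation `order` is taken as given rather than learned from data.

```python
import math


def pruned_log_likelihood(x, mean, var, order, threshold, fill):
    """Approximate the log-likelihood of x under a diagonal Gaussian.

    Elements are evaluated in `order`. If the running sum drops below
    `threshold` (assumed pruning condition), the remaining elements are
    not evaluated and each contributes the constant `fill` instead
    (assumed approximating function).

    Returns (log_likelihood, number_of_elements_evaluated).
    """
    total = 0.0
    for n, d in enumerate(order, start=1):
        diff = x[d] - mean[d]
        # Per-element log density of a univariate Gaussian.
        total += -0.5 * (math.log(2.0 * math.pi * var[d]) + diff * diff / var[d])
        remaining = len(order) - n
        if remaining and total < threshold:
            # Prune: skip the remaining elements and approximate them.
            return total + remaining * fill, n
    return total, len(order)


if __name__ == "__main__":
    mean = [0.0, 0.0, 0.0]
    var = [1.0, 1.0, 1.0]
    order = [0, 1, 2]  # placeholder order, not a data-driven one

    # A well-matched frame is evaluated in full.
    print(pruned_log_likelihood([0.0, 0.0, 0.0], mean, var, order, -10.0, -1.0))
    # A poorly matched frame is pruned after the first element.
    print(pruned_log_likelihood([10.0, 0.0, 0.0], mean, var, order, -10.0, -1.0))
```

In this sketch the saving comes from Gaussians that match the frame poorly: placing the most informative elements early in `order` lets such Gaussians be dismissed after fewer evaluations, which is the role the abstract assigns to the data-driven ordering.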
