Paper information - Feature pruning for low-power ASR systems in clean and noisy environments


Likelihood evaluation can substantially affect the total computational load for continuous hidden Markov model (HMM)-based speech-recognition systems with small vocabularies. This letter presents feature pruning, a simple yet effective technique to reduce computation and, hence, power consumption of likelihood evaluation. Our technique, under certain conditions, only evaluates the likelihoods of a fraction of feature elements and approximates those of the remaining (pruned) ones by a simple function. The order in which feature elements are evaluated is obtained by a data-driven approach to minimize computation. With this order, feature pruning can speed up the likelihood evaluation by a factor of 1.3-1.8 and reduce its power consumption by 27%-43% for various recognition tasks, including those in noisy environments.
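The idea in the abstract can be illustrated with a minimal sketch for a single diagonal-covariance Gaussian. The abstract does not specify the pruning condition or the approximating function, so everything concrete here is an assumption made for illustration: the function name `pruned_log_likelihood`, the parameters `threshold`, `min_dims`, and `fill_per_dim`, the choice of a partial-sum threshold as the pruning condition, and a constant per-element fill value as the "simple function". The evaluation `order` is taken as an input; how the letter derives it from data is not shown.

```python
import math


def pruned_log_likelihood(x, mean, var, order, threshold, min_dims, fill_per_dim):
    """Diagonal-Gaussian log-likelihood with early stopping (hypothetical sketch).

    Feature elements are visited in `order`. If, after at least `min_dims`
    elements, the partial log-likelihood falls below `threshold`, the
    remaining elements are not evaluated and each is approximated by the
    constant `fill_per_dim`. Both the condition and the constant fill are
    assumptions, not the letter's exact method.

    Returns (log_likelihood, number_of_elements_evaluated).
    """
    total = 0.0
    n = len(order)
    for count, d in enumerate(order, start=1):
        diff = x[d] - mean[d]
        # Exact per-element term of a diagonal Gaussian log-density.
        total += -0.5 * (math.log(2.0 * math.pi * var[d]) + diff * diff / var[d])
        # Assumed pruning condition: the partial score is already very low.
        if min_dims <= count < n and total < threshold:
            return total + (n - count) * fill_per_dim, count
    return total, n


# Usage: a frame far from the Gaussian mean is pruned after one element,
# while a frame at the mean is evaluated in full.
mean = [0.0, 0.0, 0.0, 0.0]
var = [1.0, 1.0, 1.0, 1.0]
order = [0, 1, 2, 3]

far, far_count = pruned_log_likelihood(
    [10.0, 0.0, 0.0, 0.0], mean, var, order,
    threshold=-20.0, min_dims=1, fill_per_dim=-1.0)
near, near_count = pruned_log_likelihood(
    [0.0, 0.0, 0.0, 0.0], mean, var, order,
    threshold=-20.0, min_dims=1, fill_per_dim=-1.0)
```

In this sketch the saving comes from skipping the per-element arithmetic for Gaussians that are already unlikely to matter; placing the most discriminating elements first in `order` would make the condition trigger earlier, which is presumably why the evaluation order matters in the letter.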

Xiao Li | Jeff A. Bilmes

[1] Enrico Bocchieri, et al. Vector quantization for the efficient computation of continuous density likelihoods, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2] Mark J. F. Gales, et al. State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs, 1999, IEEE Trans. Speech Audio Process.

[3] J.H.L. Hansen, et al. Fast likelihood computation techniques in nearest-neighbor based search for continuous speech recognition, 2001, IEEE Signal Processing Letters.

[4] Frank Seide, et al. Fast likelihood computation for continuous-mixture densities using a tree-based nearest neighbor search, 1995, EUROSPEECH.

[5] Margaret Martonosi, et al. Wattch: a framework for architectural-level power analysis and optimizations, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[6] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.

[7] Hong C. Leung, et al. PhoneBook: a phonetically-rich isolated-word telephone-speech database, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[8] Climent Nadeu Camprubí, et al. Principal and discriminant component analysis for feature selection in isolated word recognition, 1990.

[9] Todd M. Austin, et al. The SimpleScalar tool set, version 2.0, 1997, CARN.

[10] Hsiao-Wuen Hon, et al. Allophone clustering for continuous speech recognition, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[11] Jerome R. Bellegarda, et al. Tied mixture continuous parameter modeling for speech recognition, 1990, IEEE Trans. Acoust. Speech Signal Process.

[12] Vassilios Digalakis, et al. Efficient speech recognition using subvector quantization and discrete-mixture HMMs, 2000, Comput. Speech Lang.

[13] Jeff A. Bilmes, et al. Low-resource noise-robust feature post-processing on Aurora 2.0, 2002, INTERSPEECH.

[14] Xiao Li, et al. Feature pruning in likelihood evaluation of HMM-based speech recognition, 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).

[15] Jan Nouza. Feature selection methods for hidden Markov model-based speech recognition, 1996, Proceedings of 13th International Conference on Pattern Recognition.

[16] Brian Kan-Wing Mak, et al. Subspace distribution clustering hidden Markov model, 2001, IEEE Trans. Speech Audio Process.

[17] X. D. Huang, et al. Semi-continuous hidden Markov models in isolated word recognition, 1988, [1988 Proceedings] 9th International Conference on Pattern Recognition.

[18] Steve J. Young, et al. The use of state tying in continuous speech recognition, 1993, EUROSPEECH.

[19] Jay G. Wilpon, et al. Discriminative feature selection for speech recognition, 1993, Comput. Speech Lang.

[20] Isabelle Guyon, et al. An Introduction to Variable and Feature Selection, 2003, J. Mach. Learn. Res.