A signal subspace approach for speech modelling and classification

In this paper, a speech classifier inspired by the signal subspace approach is developed. A novel signal subspace speech model is first obtained via a rank-reducing subspace decomposition algorithm based on the singular value decomposition (SVD). Under the assumption that the speech signal comprises slowly changing short-term dynamics, the signal subspace of the speech signal is likewise slowly changing. The proposed signal subspace model aims to characterize these subspace dynamics using a family of subspace trajectories. In particular, each subspace trajectory is a sequence of vectors that traces the evolution of a rank-one subspace over time. An assembly of these trajectories therefore specifies the progression of the embedded signal subspace. To construct the signal subspace classifier, prototype elements in the form of signal subspace models are determined for every signal class. A minimum-distance rule, with a distance measure that resembles an energy difference function, is then applied in the actual classification task. Simulation of the proposed signal subspace classifier on an isolated-digit speech recognition task reveals promising results.
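The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the Hankel-structured data matrix, the frame and rank parameters, the residual-energy distance, and all function names (`frame_signal`, `hankel_matrix`, `subspace_trajectories`, `energy_distance`, `classify`) are assumptions chosen for illustration, and the test signal is assumed to have the same number of frames as each prototype.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping short-term frames (one per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def hankel_matrix(frame, n_cols):
    """Arrange one frame into a Hankel-structured data matrix (assumed layout)."""
    n_rows = len(frame) - n_cols + 1
    return np.stack([frame[i : i + n_cols] for i in range(n_rows)])

def subspace_trajectories(x, frame_len=64, hop=32, n_cols=16, rank=4):
    """Rank-reduce every frame with the SVD and keep the leading right
    singular vectors. Result shape: (n_frames, rank, n_cols); slice
    [:, k, :] is the k-th trajectory, i.e. a rank-one subspace over time."""
    vecs = []
    for frame in frame_signal(x, frame_len, hop):
        _, _, vt = np.linalg.svd(hankel_matrix(frame, n_cols), full_matrices=False)
        vecs.append(vt[:rank])
    return np.array(vecs)

def energy_distance(x, proto, frame_len=64, hop=32, n_cols=16):
    """Hypothetical energy-difference distance: total frame energy minus the
    energy captured by projecting each frame onto the prototype subspace."""
    total = 0.0
    for frame, v in zip(frame_signal(x, frame_len, hop), proto):
        h = hankel_matrix(frame, n_cols)
        total += np.sum(h ** 2) - np.sum((h @ v.T) ** 2)
    return total

def classify(x, prototypes):
    """Minimum-distance rule over a dict mapping class label -> prototype."""
    return min(prototypes, key=lambda label: energy_distance(x, prototypes[label]))
```

Because the distance is computed from projections onto the prototype subspace, it is unaffected by the sign ambiguity of singular vectors. A real isolated-digit recognizer would additionally need some form of time alignment between utterances of different lengths, which this sketch omits.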
