Application of Auditory Representations on Speaker Identification

The noise robustness of the auditory spectrum and the cortical representation is examined by applying them to text-independent speaker identification tasks. A Bayes classifier based on an M-ary hypothesis test is employed to evaluate the robustness of the auditory cepstrum and to demonstrate its superior performance over the well-studied mel-cepstrum. In addition, the phase feature of the wavelet-transform-based multiscale cortical representation is shown to be much more stable than the magnitude feature in characterizing speakers, using a correlator technique that is traditionally used in scene-matching applications. This observation is consistent with physiological and psychoacoustic phenomena. The underlying purpose of this study is to inspect the inherent robustness of auditory representations that are derived from a model based on human perception. The experimental results indicate that the biologically motivated features significantly enhance speaker identification accuracy in noisy environments.
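As a rough illustration of what an M-ary Bayes decision over speakers can look like, the sketch below scores an utterance against one model per speaker and picks the most probable one. It is not the thesis's implementation: the single full-covariance Gaussian per speaker, the frame-independence assumption, the equal default priors, and the names `fit_gaussian`, `log_likelihood`, and `identify` are all assumptions made for this example. The feature frames stand in for any cepstral vectors (auditory cepstrum or mel-cepstrum).

```python
import numpy as np


def fit_gaussian(frames):
    """Estimate a mean and covariance from (n_frames, n_dims) training features."""
    mean = frames.mean(axis=0)
    # Small diagonal term keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
    return mean, cov


def log_likelihood(frames, mean, cov):
    """Sum of Gaussian log-densities over all frames, treating frames as independent."""
    d = frames.shape[1]
    diff = frames - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every frame to the speaker mean.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return float(np.sum(-0.5 * (maha + logdet + d * np.log(2.0 * np.pi))))


def identify(frames, models, priors=None):
    """M-ary decision: return the index of the speaker with the largest
    log-likelihood plus log-prior (minimum-error-rate rule under equal costs)."""
    m = len(models)
    priors = np.full(m, 1.0 / m) if priors is None else np.asarray(priors, dtype=float)
    scores = [
        log_likelihood(frames, mean, cov) + np.log(p)
        for (mean, cov), p in zip(models, priors)
    ]
    return int(np.argmax(scores))
```

In use, `models` would be a list built by calling `fit_gaussian` on each speaker's training frames, and `identify(test_frames, models)` would return the index of the hypothesized speaker. Robustness could then be compared by repeating the decision on noise-corrupted test frames for each feature type.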
