This paper presents initial results from a performance analysis of a voice command interpretation and authorisation system that uses a voiceprint to identify the human commander. Two approaches based on human-voice-related algorithms are proposed. Mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive (PLP) analysis are two feature extraction methods that closely mimic the human auditory system. Both methods were applied to the proposed system to determine their suitability for use in a commander recognition system. Vector quantization (VQ) with the Linde-Buzo-Gray (LBG) iterative algorithm was used to cluster the extracted features for the classification of commanders. The two methods were compared in a MATLAB simulation environment on the basis of false rejection rate (rejecting an authorised commander), false acceptance rate (accepting an unauthorised commander), and execution time. Based on the initial results, both methods achieved accurate classification, and the PLP method showed a shorter execution time and a lower false acceptance rate than MFCC. The combined approach (MFCC-PLP) did not improve considerably on the individual PLP and MFCC feature models without incurring high computational costs that would compromise the performance of speaker recognition tasks. Therefore, the PLP method is the best candidate for the command-recognition system to be developed in the second phase of this research.
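The classification and evaluation steps described above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB environment used in the study, and it is not the authors' implementation. Synthetic 2-D vectors stand in for MFCC or PLP feature frames, and the function names (`lbg`, `avg_distortion`, `error_rates`), the split offset `eps`, and the acceptance threshold are all assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def avg_distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def lbg(vectors, size, eps=0.01, tol=1e-6):
    """Train a VQ codebook by LBG splitting; `size` is assumed to be a power of two."""
    dim = len(vectors[0])
    # Start from a single codeword: the centroid of all training vectors.
    codebook = [[sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]]
    while len(codebook) < size:
        # Split every codeword into two slightly offset copies.
        codebook = [[c + s * eps for c in cw] for cw in codebook for s in (1, -1)]
        prev = float("inf")
        while True:
            # Assign each vector to its nearest codeword, then recompute centroids.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            for i, cell in enumerate(cells):
                if cell:  # an empty cell keeps its previous codeword
                    codebook[i] = [sum(v[d] for v in cell) / len(cell) for d in range(dim)]
            dist = avg_distortion(vectors, codebook)
            if prev - dist <= tol * max(dist, 1e-12):
                break
            prev = dist
    return codebook

def error_rates(trials, threshold):
    """trials: (distortion, is_authorised) pairs; accept when distortion <= threshold.
    Returns (false_acceptance_rate, false_rejection_rate)."""
    genuine = [d for d, ok in trials if ok]
    impostor = [d for d, ok in trials if not ok]
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return far, frr

def cloud(cx, cy, n):
    """Synthetic 2-D 'feature frames' around (cx, cy); a stand-in for real MFCC/PLP output."""
    return [[random.gauss(cx, 0.5), random.gauss(cy, 0.5)] for _ in range(n)]

random.seed(0)
enrolled = cloud(0, 0, 50) + cloud(5, 5, 50)   # authorised commander, training data
codebook = lbg(enrolled, 4)

# Each trial is one utterance scored against the enrolled commander's codebook.
trials = [(avg_distortion(cloud(0, 0, 20) + cloud(5, 5, 20), codebook), True) for _ in range(10)]
trials += [(avg_distortion(cloud(10, -10, 40), codebook), False) for _ in range(10)]

far, frr = error_rates(trials, threshold=5.0)  # threshold is an arbitrary illustrative value
print(f"FAR={far:.2f}  FRR={frr:.2f}")
```

In this sketch an utterance is accepted when its average distortion against the enrolled codebook falls below the threshold. Raising the threshold lowers the false rejection rate at the cost of a higher false acceptance rate, which is the trade-off the two reported error rates capture.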