A novel approach to Automatic Speaker Identification (ASI) that employs waveform-based signal representation in multiple domains is presented. The proposed approach involves two stages: an encoding stage and a decoding stage. During the encoding stage (training mode), mixed transform coding in conjunction with split vector quantization (MTSVQ) is employed to form representative codebooks for each speaker. During the decoding stage (running mode), the codebook vectors that best represent each unknown input speech vector are selected. A normalized matching accuracy measure is developed to evaluate the proposed algorithm's performance. The resulting technique is consistently found to achieve higher ASI accuracy than earlier approaches, such as vector quantization employing a single transform domain.
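The two-stage structure described above can be illustrated with a minimal sketch. It is not the authors' MTSVQ algorithm: it assumes fixed-length speech frames, uses DCT and Haar as two illustrative transform domains, trains each split codebook with plain k-means, and scores a speaker by the average per-frame distortion in whichever domain fits best. All function names and parameters (`train_speaker`, `identify`, `n_splits`, `size`) are hypothetical, and the normalised matching accuracy measure mentioned in the abstract is not reproduced here.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m


def haar_matrix(n):
    """Orthonormal Haar matrix of size n x n (n must be a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        coarse = np.kron(h, np.array([[1.0, 1.0]]))
        detail = np.kron(np.eye(h.shape[0]), np.array([[1.0, -1.0]]))
        h = np.vstack([coarse, detail]) / np.sqrt(2.0)
    return h


def train_codebook(vectors, size, iters=20, seed=0):
    """Plain k-means codebook (assumed stand-in for the paper's training)."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), size, replace=False)].copy()
    for _ in range(iters):
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        for j in range(size):
            members = vectors[nearest == j]
            if len(members):  # leave empty cells unchanged
                codebook[j] = members.mean(axis=0)
    return codebook


def train_speaker(frames, transforms, n_splits=2, size=8):
    """Encoding stage: one codebook per transform domain and per split."""
    model = {}
    for name, t in transforms.items():
        parts = np.array_split(frames @ t.T, n_splits, axis=1)
        model[name] = [train_codebook(p, size) for p in parts]
    return model


def frame_distortions(frames, model, transforms):
    """Per-frame squared error, taking the best-fitting domain per frame."""
    per_domain = []
    for name, t in transforms.items():
        parts = np.array_split(frames @ t.T, len(model[name]), axis=1)
        err = np.zeros(len(frames))
        for part, codebook in zip(parts, model[name]):
            d = ((part[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            err += d.min(axis=1)
        per_domain.append(err)
    return np.min(per_domain, axis=0)


def identify(frames, models, transforms):
    """Decoding stage: pick the speaker whose codebooks give least distortion."""
    scores = {
        spk: frame_distortions(frames, m, transforms).mean()
        for spk, m in models.items()
    }
    return min(scores, key=scores.get)
```

Selecting the lower-distortion domain per frame is a simplification chosen for brevity; a mixed transform representation in the sense of the abstract would combine partial sets of basis functions from several domains rather than choose one.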