Speaker Identification using Spectrograms of Varying Frame Sizes

In this paper, a text-dependent speaker recognition algorithm based on spectrograms is proposed. The spectrograms are generated using the Discrete Fourier Transform for varying frame sizes, with 25% and 50% overlap between speech frames. The feature vector is the row mean vector of each spectrogram. For feature matching, two distance measures are used: Euclidean distance and Manhattan distance. Results are computed on two databases: a locally created database and the CSLU speaker recognition database. The maximum accuracy is 92.52%, obtained with 50% overlap between speech frames and Manhattan distance as the similarity measure.
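
The pipeline summarized above (framing with overlap, DFT magnitude spectrogram, row mean feature vector, nearest-match by distance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that spectrogram rows correspond to frequency bins and columns to frames, uses a naive DFT without windowing, and substitutes synthetic tones for real speech. All function names, the `tone` helper, the speaker labels, and the sampling rate are hypothetical choices made for illustration.

```python
import cmath
import math


def frames(signal, frame_size, overlap):
    """Split a signal into frames of frame_size samples with fractional overlap."""
    hop = max(1, int(frame_size * (1.0 - overlap)))
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]


def dft_magnitude(frame):
    """Magnitude of the DFT of one frame (naive O(N^2), non-negative frequencies)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def row_mean_vector(signal, frame_size, overlap):
    """Average each spectrogram row (assumed: one row per frequency bin) over frames."""
    columns = [dft_magnitude(f) for f in frames(signal, frame_size, overlap)]
    if not columns:
        raise ValueError("signal is shorter than one frame")
    return [sum(col[k] for col in columns) / len(columns)
            for k in range(len(columns[0]))]


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def identify(test_vec, enrolled, distance=manhattan):
    """Return the enrolled label whose feature vector is closest to test_vec."""
    return min(enrolled, key=lambda name: distance(test_vec, enrolled[name]))


def tone(freq, n=256, rate=8000):
    """Hypothetical stand-in for a speech sample: a pure sine tone."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


# Enroll two synthetic "speakers" with frame size 64 and 50% overlap.
enrolled = {
    "speaker_a": row_mean_vector(tone(500), 64, 0.5),
    "speaker_b": row_mean_vector(tone(1500), 64, 0.5),
}
probe = row_mean_vector(tone(520), 64, 0.5)
print(identify(probe, enrolled, distance=manhattan))
print(identify(probe, enrolled, distance=euclidean))
```

Changing `frame_size` trades frequency resolution against the number of frames averaged, and changing `overlap` between 0.25 and 0.5 changes the hop between frames, which are the two parameters the abstract says were varied. A practical version would likely use an FFT routine and real recordings rather than this naive transform.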
