Biometric speaker classification

We address the problem of classifying speakers based on measurements of features obtained from their speech. The approach is an adaptation of biometric methods used to identify people, but it must differ for speech because speech is not stationary. We therefore propose classifying speakers by the statistical distributions of parameters that can be accurately estimated by modern signal processing techniques. The intent is to develop a speaker clustering algorithm that is independent of the transmission channel, insensitive to language variations, and able to be retrained with minimal data to include a new speaker. We demonstrate effectiveness on the problem of identifying the speaker's gender, and present evidence that the methods may be extended to the general problem of speaker identification.
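As an illustration of classifying by the statistical distribution of a parameter, the following is a minimal sketch and not the method of the paper. It assumes a single per-frame feature (log pitch), models each class with one Gaussian, and uses synthetic data; the class labels, pitch centers, spread, and all function names are hypothetical choices made only for this example.

```python
import math
import random
from statistics import fmean, pstdev


def fit_gaussian(values):
    """Summarize a list of per-frame feature values by (mean, std)."""
    return fmean(values), max(pstdev(values), 1e-6)


def log_likelihood(values, model):
    """Total Gaussian log-likelihood of the frames under one class model."""
    mean, std = model
    const = -0.5 * math.log(2.0 * math.pi * std * std)
    return sum(const - (v - mean) ** 2 / (2.0 * std * std) for v in values)


def classify(values, models):
    """Return the label whose distribution best explains the frames."""
    return max(models, key=lambda label: log_likelihood(values, models[label]))


def synthetic_log_pitch(center_hz, n, rng):
    """Hypothetical stand-in for measured pitch: log-pitch frames with noise."""
    return [math.log(center_hz) + rng.gauss(0.0, 0.15) for _ in range(n)]


rng = random.Random(0)

# Assumed training data: two classes with different typical pitch.
models = {
    "lower_pitch": fit_gaussian(synthetic_log_pitch(120.0, 500, rng)),
    "higher_pitch": fit_gaussian(synthetic_log_pitch(210.0, 500, rng)),
}

# Classify a short utterance from the distribution of its frames,
# not from any single frame.
utterance = synthetic_log_pitch(200.0, 50, rng)
predicted = classify(utterance, models)

# Adding a new class needs only a small sample to fit one more model.
models["new_speaker"] = fit_gaussian(synthetic_log_pitch(160.0, 30, rng))
```

Because each class is summarized by a few distribution parameters, adding a class in this sketch only requires fitting one more model, which is one simple way to picture retraining with minimal data.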
