Speech Processing Background

In this chapter, we review some of the building blocks of speech processing systems. We then discuss the specifics of speaker verification, speaker identification, and speech recognition. We will reuse these constructions when designing privacy-preserving algorithms for these tasks in the remainder of the thesis.
