STUDIES ON VOICE ACTIVITY DETECTION AND FEATURE DIVERSITY FOR SPEAKER RECOGNITION

[1]  Sree Hari Krishnan Parthasarathi,et al.  Robustness of phase based features for speaker recognition , 2009, INTERSPEECH.

[2]  J. Makhoul,et al.  Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.

[3]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .

[4]  Roberto Battiti,et al.  Using mutual information for selecting features in supervised neural net learning , 1994, IEEE Trans. Neural Networks.

[5]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[6]  Thierry Dutoit,et al.  Chirp group delay analysis of speech signals , 2007, Speech Commun..

[7]  Rajesh M. Hegde,et al.  Significance of the Modified Group Delay Feature in Speech Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[8]  Javier Ramírez,et al.  Efficient voice activity detection algorithms using long-term speech information , 2004, Speech Commun..

[9]  John W. Sammon,et al.  A Nonlinear Mapping for Data Structure Analysis , 1969, IEEE Transactions on Computers.

[10]  Kuldip K. Paliwal,et al.  Product of power spectrum and group delay function for speech recognition , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[11]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[12]  Roland Auckenthaler,et al.  Score Normalization for Text-Independent Speaker Verification Systems , 2000, Digit. Signal Process..

[13]  Wei Zhang,et al.  A soft voice activity detector based on a Laplacian-Gaussian model , 2003, IEEE Trans. Speech Audio Process..

[14]  Christophe d'Alessandro,et al.  The voice source as a causal/anticausal linear filter , 2003 .

[15]  Alan V. Oppenheim,et al.  Discrete-Time Signal Pro-cessing , 1989 .

[16]  M. W Gardner,et al.  Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences , 1998 .

[17]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[18]  Haizhou Li,et al.  GMM-SVM Kernel With a Bhattacharyya-Based Distance for Speaker Recognition , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[19]  Douglas A. Reynolds,et al.  Modeling of the glottal flow derivative waveform with application to speaker identification , 1999, IEEE Trans. Speech Audio Process..

[20]  Douglas A. Reynolds,et al.  The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective , 2000, Speech Commun..

[21]  Hema A. Murthy,et al.  Minimum phase signal derived from root cepstrum , 2003 .

[22]  Daniel Garcia-Romero,et al.  Analysis of i-vector Length Normalization in Speaker Recognition Systems , 2011, INTERSPEECH.

[23]  Sadaoki Furui,et al.  Likelihood normalization for speaker verification using a phoneme- and speaker-independent model , 1995, Speech Commun..

[24]  R. Padmanabhan,et al.  On parametric representations of the modified group delay , 2008, 2008 International Conference on Audio, Language and Image Processing.

[25]  Patrick Kenny,et al.  Speaker and Session Variability in GMM-Based Speaker Verification , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[26]  Aaron E. Rosenberg,et al.  Speaker background models for connected digit password speaker verification , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[27]  Hynek Hermansky,et al.  Qualcomm-ICSI-OGI features for ASR , 2002, INTERSPEECH.

[28]  Yuval Bistritz,et al.  Adaptive individual background model for speaker verification , 2009, INTERSPEECH.

[29]  Giuseppe Ruggeri,et al.  Performance evaluation and comparison of ITU-T/ETSI voice activity detectors , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[30]  Shrikanth S. Narayanan,et al.  Robust Voice Activity Detection Using Long-Term Signal Variability , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[31]  Rajesh M. Hegde,et al.  Dynamic selection of magnitude and phase based acoustic feature streams for speaker verification , 2009, 2009 17th European Signal Processing Conference.

[32]  Rajesh M. Hegde,et al.  Significance of Joint Features Derived from the Modified Group Delay Function in Speech Processing , 2007, EURASIP J. Audio Speech Music. Process..

[33]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[34]  Francesco Beritelli,et al.  A robust voice activity detector for wireless communications using soft computing , 1998, IEEE J. Sel. Areas Commun..

[35]  A. Kondoz,et al.  Analysis and improvement of a statistical model-based voice activity detector , 2001, IEEE Signal Processing Letters.

[36]  Zdravko Kacic,et al.  A multiconditional robust front-end feature extraction with a noise reduction procedure based on improved spectral subtraction algorithm , 2001, INTERSPEECH.

[37]  Larry P. Heck,et al.  Handset-dependent background models for robust text-independent speaker recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[38]  Rafik A. Goubran,et al.  Robust voice activity detection using higher-order statistics in the LPC residual domain , 2001, IEEE Trans. Speech Audio Process..

[39]  José Tribolet,et al.  A new phase unwrapping algorithm , 1977 .

[40]  A.V. Oppenheim,et al.  The importance of phase in signals , 1980, Proceedings of the IEEE.

[41]  Haizhou Li,et al.  An overview of text-independent speaker recognition: From features to supervectors , 2010, Speech Commun..

[42]  R. Tucker,et al.  Voice activity detection using a periodicity measure , 1992 .

[43]  Bayya Yegnanarayana,et al.  Significance of group delay functions in spectrum estimation , 1992, IEEE Trans. Signal Process..

[44]  Hema A. Murthy,et al.  The modified group delay function and its application to phoneme recognition , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[45]  E. Shlomot,et al.  ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications , 1997, IEEE Commun. Mag..

[46]  Alex Acero,et al.  Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .

[47]  Sridhar Krishna Nemala,et al.  The UMD-JHU 2011 speaker recognition system , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[48]  Chungyong Lee,et al.  An information-theoretic perspective on feature selection in speaker recognition , 2005, IEEE Signal Processing Letters.

[49]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[50]  Mübeccel Demirekler,et al.  Speaker identification by combining multiple classifiers using Dempster-Shafer theory of evidence , 2003, Speech Commun..

[51]  Douglas A. Reynolds,et al.  A Tutorial on Text-Independent Speaker Verification , 2004, EURASIP J. Adv. Signal Process..

[52]  Douglas E. Sturim,et al.  The MIT lincoln laboratory 2008 speaker recognition system , 2009, INTERSPEECH.

[53]  Patrick Kenny,et al.  Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[54]  Hervé Bourlard,et al.  User-customized password speaker verification using multiple reference and background models , 2006, Speech Commun..

[55]  B. Yegnanarayana,et al.  Significance of group delay functions in signal reconstruction from spectral magnitude or phase , 1984 .

[56]  Jonathan G. Fiscus,et al.  Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST , 1993 .

[57]  A. Stolcke,et al.  Combining feature sets with support vector machines: application to speaker recognition , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005..

[58]  Wonyong Sung,et al.  A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.

[59]  Alvin F. Martin,et al.  The DET curve in assessment of detection task performance , 1997, EUROSPEECH.

[60]  B.S. Atal,et al.  Automatic recognition of speakers from their voices , 1976, Proceedings of the IEEE.

[61]  Kuldip K. Paliwal,et al.  Short-time phase spectrum in speech processing: A review and some experimental results , 2007, Digit. Signal Process..

[62]  Sree Hari Krishnan Parthasarathi,et al.  Robustness of group delay representations for noisy speech signals , 2011, Int. J. Speech Technol..

[63]  R. Padmanabhan,et al.  Robust Voice Activity Detection using Group Delay Functions , 2006, 2006 IEEE International Conference on Industrial Technology.

[64]  Hema A. Murthy,et al.  Acoustic feature diversity and speaker verification , 2010, INTERSPEECH.

[65]  M.N.S. Swamy,et al.  An improved voice activity detection using higher order statistics , 2005, IEEE Transactions on Speech and Audio Processing.

[66]  M. Gabrea,et al.  Correlation coefficient-based voice activity detector algorithm , 2004, Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513).

[67]  Lawrence R. Rabiner,et al.  A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition , 1976 .

[68]  K. Srinivasan,et al.  Voice activity detection for cellular networks , 1993, Proceedings., IEEE Workshop on Speech Coding for Telecommunications,.

[69]  Li Lee,et al.  A frequency warping approach to speaker normalization , 1998, IEEE Trans. Speech Audio Process..

[70]  David G. Stork,et al.  Pattern Classification (2nd ed.) , 1999 .

[71]  Jr. J.P. Campbell,et al.  Speaker recognition: a tutorial , 1997, Proc. IEEE.

[72]  George R. Doddington,et al.  Speaker verification over long distance telephone lines , 1989, International Conference on Acoustics, Speech, and Signal Processing,.

[73]  Kuldip K. Paliwal,et al.  The importance of phase in speech enhancement , 2011, Speech Commun..

[74]  Daniel P. W. Ellis,et al.  Using mutual information to design feature combinations , 2000, INTERSPEECH.

[75]  Douglas E. Sturim,et al.  Support vector machines using GMM supervectors for speaker verification , 2006, IEEE Signal Processing Letters.

[76]  Hans-Günter Hirsch,et al.  Noise estimation techniques for robust speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[77]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[78]  Bayya Yegnanarayana,et al.  Formant extraction from group delay function , 1991, Speech Commun..

[79]  David G. Stork,et al.  Pattern Classification , 1973 .

[80]  W·M·贝尔特曼,et al.  Speech audio process , 2011 .

[81]  Petros Maragos,et al.  Speech event detection using multiband modulation energy , 2005, INTERSPEECH.

[82]  Harry Wechsler,et al.  Detection of human speech in structured noise , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[83]  Mohamed Kamal Omar,et al.  Feature normalization for speaker verification in room reverberation , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[84]  Alvin F. Martin,et al.  The NIST speaker recognition evaluation program , 2005 .

[85]  Larry P. Heck,et al.  Robust text-independent speaker identification over telephone channels , 1999, IEEE Trans. Speech Audio Process..