Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction
