Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction
Paavo Alku | Tomi Kinnunen | Ville Vestman | Dhananjaya N. Gowda | Md. Sahidullah
[1] Rosa González Hautamäki,et al. Acoustical and perceptual study of voice disguise by age modification in speaker verification , 2017, Speech Commun..
[2] Tiago H. Falk,et al. Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification , 2017, Comput. Speech Lang..
[3] Paavo Alku,et al. Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions , 2017, INTERSPEECH.
[4] Rajib Sharma,et al. Analysis of the Intrinsic Mode Functions for Speaker Information , 2017, Speech Commun..
[5] Lukás Burget,et al. HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Mahesh Kumar Nandwana,et al. Analysis of human scream and its impact on text-independent speaker verification. , 2017, The Journal of the Acoustical Society of America.
[7] Petr Motlícek,et al. Template-matching for text-dependent speaker verification , 2017, Speech Commun..
[8] M. Picheny,et al. Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences , 2017 .
[9] Tomi Kinnunen,et al. Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[10] Sanjeev Khudanpur,et al. Deep neural network-based speaker embeddings for end-to-end speaker verification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[11] Lukás Burget,et al. i-Vector/HMM Based Text-Dependent Speaker Verification System for RedDots Challenge , 2016, INTERSPEECH.
[12] Thomas Fang Zheng,et al. Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Douglas D. O'Shaughnessy,et al. Feature mapping, score-, and feature-level fusion for improved normal and whispered speech speaker verification , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Georg Heigold,et al. End-to-end text-dependent speaker verification , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Paavo Alku,et al. Feature Extraction Using Power-Law Adjusted Linear Prediction With Application to Speaker Recognition Under Severe Vocal Effort Mismatch , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] W. Heeren. Vocalic correlates of pitch in whispered versus normal speech. , 2015, The Journal of the Acoustical Society of America.
[17] Milton Sarria-Paja,et al. Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis , 2015 .
[18] John H. L. Hansen,et al. Speaker Recognition by Machines and Humans: A tutorial review , 2015, IEEE Signal Processing Magazine.
[19] Ya Zhang,et al. Deep feature for text-dependent speaker verification , 2015, Speech Commun..
[20] John H. L. Hansen,et al. Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification , 2015, Speech Commun..
[21] Tiago H. Falk,et al. The effects of whispered speech on state-of-the-art voice based biometrics systems , 2015, 2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE).
[22] Hynek Hermansky,et al. Robust Feature Extraction Using Modulation Filtering of Autoregressive Models , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Tomi Kinnunen,et al. From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification , 2014, Digit. Signal Process..
[24] Paavo Alku,et al. Mixture Linear Prediction in Speaker Verification Under Vocal Effort Mismatch , 2014, IEEE Signal Processing Letters.
[25] Bin Ma,et al. A whispered Mandarin corpus for speech technology applications , 2014, INTERSPEECH.
[26] Douglas D. O'Shaughnessy,et al. Whispered speaker verification and gender detection using weighted instantaneous frequencies , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[27] H. Masthoff. A report on a voice disguise experiment , 2013 .
[28] N. P. Jawarkar,et al. Speaker Identification Using Whispered Speech , 2013, 2013 International Conference on Communication Systems and Network Technologies.
[29] John H. L. Hansen,et al. Acoustic analysis and feature transformation from neutral to whisper for speaker identification within whispered speech audio streams , 2013, Speech Commun..
[30] Tomi Kinnunen,et al. Effect of multicondition training on i-vector PLDA configurations for speaker recognition , 2013, INTERSPEECH.
[31] Yun Lei,et al. Robust feature front-end for speaker identification , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Daniel Garcia-Romero,et al. Multicondition training of Gaussian PLDA models in i-vector space for noise and reverberation robust speaker recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] John H. L. Hansen,et al. Speaker Identification Within Whispered Speech Audio Streams , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[34] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[35] D Rudoy,et al. Time-Varying Autoregressions in Speech: Detection Theory and Applications , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[36] Patricia A. Keating,et al. Voicesauce: A Program for Voice Analysis , 2009, ICPhS.
[37] Boon Pang Lim,et al. Computational differences between whispered and non-whispered speech , 2011 .
[38] John H. L. Hansen,et al. Speaker Identification for Whispered Speech Using a Training Feature Transformation from Neutral to Whisper , 2011, INTERSPEECH.
[39] Björn Schuller,et al. The Munich 2011 CHiME Challenge Contribution: NMF-BLSTM Speech Enhancement and Recognition for Reverberated Multisource Environments , 2011, Interspeech 2011.
[40] John H. L. Hansen,et al. Acoustic analysis for speaker identification of whispered speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[41] Patrick Kenny,et al. Bayesian Speaker Verification with Heavy-Tailed Priors , 2010, Odyssey.
[42] Haizhou Li,et al. An overview of text-independent speaker recognition: From features to supervectors , 2010, Speech Commun..
[43] John H. L. Hansen,et al. Speaker identification with whispered speech based on modified LFCC parameters and feature mapping , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[44] John H. L. Hansen,et al. Speaker identification for whispered speech using modified temporal patterns and MFCCs , 2009, INTERSPEECH.
[45] John H. L. Hansen,et al. Advancements in whisper-island detection within normally phonated audio streams , 2009, INTERSPEECH.
[46] Hynek Hermansky,et al. Recognition of Reverberant Speech Using Frequency Domain Linear Prediction , 2008, IEEE Signal Processing Letters.
[47] Fred Cummins,et al. Speaker Identification Using Instantaneous Frequencies , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[48] John H. L. Hansen,et al. Speaker identification for whispered speech based on frequency warping and score competition , 2008, INTERSPEECH.
[49] James H. Elder,et al. Probabilistic Linear Discriminant Analysis for Inferences About Identity , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[50] John H. L. Hansen,et al. Analysis and classification of speech mode: whispered through shouted , 2007, INTERSPEECH.
[51] Daniel P. W. Ellis,et al. Autoregressive Modeling of Temporal Envelopes , 2007, IEEE Transactions on Signal Processing.
[52] Tanja Schultz,et al. Whispering Speaker Identification , 2007, 2007 IEEE International Conference on Multimedia and Expo.
[53] Xin Wang,et al. Laplacian Operator-Based Edge Detectors , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[54] Juraj Simko,et al. The CHAINS corpus: CHAracterizing INdividual Speakers , 2006 .
[55] Patrick Kenny,et al. Joint Factor Analysis of Speaker and Session Variability: Theory and Algorithms , 2006 .
[56] Kazuya Takeda,et al. Analysis and recognition of whispered speech , 2005, Speech Commun..
[57] Daniel P. W. Ellis,et al. PLP2: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns , 2004 .
[58] Samy Bengio,et al. A statistical significance test for person authentication , 2004, Odyssey.
[59] Daniel P. W. Ellis,et al. PLP-squared: autoregressive modeling of auditory-like 2-d spectro-temporal patterns , 2004, SAPA@INTERSPEECH.
[60] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[61] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[62] H. Künzel. Effects of voice disguise on speaking fundamental frequency , 2000 .
[63] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..
[64] R. Kumaresan,et al. Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications , 1999 .
[65] Bhaskar D. Rao,et al. All-pole modeling of speech based on the minimum variance distortionless response spectrum , 2000, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).
[66] James David Johnston,et al. Enhancing the Performance of Perceptual Audio Coders by Using Temporal Noise Shaping (TNS) , 1996 .
[67] H. Takahashi,et al. Perceived pitch of whispered vowels--relationship with formant frequencies: a preliminary study. , 1996, Journal of voice : official journal of the Voice Foundation.
[68] Douglas A. Reynolds,et al. Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..
[69] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..
[70] Jean-Luc Gauvain,et al. Speaker adaptation based on MAP estimation of HMM parameters , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[71] J C Junqua,et al. The Lombard reflex and its role on human listeners and automatic speech recognizers. , 1993, The Journal of the Acoustical Society of America.
[72] V. Tartter. What's in a whisper? , 1989, The Journal of the Acoustical Society of America.
[73] Yves Grenier,et al. Time-dependent ARMA modeling of nonstationary signals , 1983 .
[74] A. Willsky,et al. Time-varying parametric modeling of speech , 1983 .
[75] Alan Oppenheim,et al. Time-varying parametric modeling of speech , 1977, 1977 IEEE Conference on Decision and Control including the 16th Symposium on Adaptive Processes and A Special Symposium on Fuzzy Set Theory and Applications.
[76] L. A. Liporace. Linear estimation of nonstationary signals. , 1975, The Journal of the Acoustical Society of America.
[77] J. Makhoul,et al. Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.