Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification
Haizhou Li | Kong-Aik Lee | Tomi Kinnunen | Rahim Saeidi | Maria Hansson | Johan Sandberg | Filip Sedlak
[1] Rahim Saeidi,et al. Particle Swarm Optimization for Sorted Adapted Gaussian Mixture Models , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Haizhou Li,et al. An overview of text-independent speaker recognition: From features to supervectors , 2010, Speech Commun..
[3] Kurt S. Riedel,et al. Minimum bias multiple taper spectral estimation , 1995, IEEE Trans. Signal Process..
[4] D.J. Thomson,et al. Jackknifing Multitaper Spectrum Estimates , 2007, IEEE Signal Processing Magazine.
[5] Alvin F. Martin,et al. The DET curve in assessment of detection task performance , 1997, EUROSPEECH.
[6] Haizhou Li,et al. Temporal Structure Normalization of Speech Feature for Robust Speech Recognition , 2007, IEEE Signal Processing Letters.
[7] Alex Acero,et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .
[8] D. Thomson,et al. Spectrum estimation and harmonic analysis , 1982, Proceedings of the IEEE.
[9] David A. van Leeuwen,et al. NIST and NFI-TNO evaluations of automatic speaker recognition , 2006, Comput. Speech Lang..
[10] J. Makhoul,et al. Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.
[11] Donald B. Percival,et al. The variance of multitaper spectrum estimates for real Gaussian processes , 1994, IEEE Trans. Signal Process..
[12] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..
[13] Tomi Kinnunen,et al. Multitaper Estimation of Frequency-Warped Cepstra With Application to Speaker Verification , 2010, IEEE Signal Processing Letters.
[14] Patrick Kenny,et al. Joint Factor Analysis of Speaker and Session Variability: Theory and Algorithms , 2006 .
[15] Sridha Sridharan,et al. Feature warping for robust speaker verification , 2001, Odyssey.
[16] Maria Hansson,et al. Optimal cepstrum estimation using multiple windows , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Douglas D. O'Shaughnessy,et al. Multi-taper MFCC features for speaker verification using I-vectors , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[18] Paavo Alku,et al. Temporally Weighted Linear Prediction Features for Speaker Verification in Additive Noise , 2010, Odyssey.
[19] Rong Tong,et al. The I4U system in NIST 2008 speaker recognition evaluation , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Douglas E. Sturim,et al. Support vector machines using GMM supervectors for speaker verification , 2006, IEEE Signal Processing Letters.
[21] Donald B. Percival,et al. Spectral Analysis for Physical Applications , 1993 .
[22] Shrikanth S. Narayanan,et al. Robust Voice Activity Detection Using Long-Term Signal Variability , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Patrick Kenny,et al. Speaker and Session Variability in GMM-Based Speaker Verification , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[24] Maria Hansson,et al. A multiple window method for estimation of peaked spectra , 1997, IEEE Trans. Signal Process..
[25] Tomi Kinnunen,et al. What else is new than the Hamming window? Robust MFCCs for speaker recognition via multitapering , 2010, INTERSPEECH.
[26] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..
[27] F. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform , 1978, Proceedings of the IEEE.
[28] Paavo Alku,et al. Temporally Weighted Linear Prediction Features for Tackling Additive Noise in Speaker Verification , 2010, IEEE Signal Processing Letters.
[29] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[30] Sridha Sridharan,et al. Modelling session variability in text-independent speaker verification , 2005, INTERSPEECH.
[31] Andreas Stolcke,et al. Speaker Recognition With Session Variability Normalization Based on MLLR Adaptation Transforms , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[32] Patrick Kenny,et al. Joint Factor Analysis Versus Eigenchannels in Speaker Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[33] J. G. Woodward,et al. IEEE Transactions on Audio and Electroacoustics , 1968 .
[34] Paavo Alku,et al. Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions , 2010, INTERSPEECH.
[35] William M. Campbell,et al. Advances in channel compensation for SVM speaker recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[36] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[37] Roland Auckenthaler,et al. Score Normalization for Text-Independent Speaker Verification Systems , 2000, Digit. Signal Process..
[38] Rainer Martin,et al. On the Statistics of Spectral Amplitudes After Variance Reduction by Temporal Cepstrum Smoothing and Cepstral Nulling , 2009, IEEE Transactions on Signal Processing.
[39] David A. van Leeuwen,et al. Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006 , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[40] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .
[41] Gordon Ramsay,et al. Multitaper analysis of fundamental frequency variations during voiced fricatives , 2003 .
[42] L. P. Ricotti. Multitapering and a wavelet variant of MFCC in speech recognition , 2005 .
[43] P. Welch. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms , 1967 .
[44] Sven Nordholm,et al. Statistical Voice Activity Detection Using Low-Variance Spectrum Estimation and an Adaptive Threshold , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[45] Haizhou Li,et al. GMM-SVM Kernel With a Bhattacharyya-Based Distance for Speaker Recognition , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[46] N. Erdol,et al. Multitaper Covariance Estimation and Spectral Denoising , 2005, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, 2005..
[47] Jeff A. Bilmes,et al. MVA Processing of Speech Features , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[48] Patrick Kenny,et al. A Study of Interspeaker Variability in Speaker Verification , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .
[50] Yi Hu,et al. Speech enhancement based on wavelet thresholding the multitaper spectrum , 2004, IEEE Transactions on Speech and Audio Processing.
[51] Arnold Neumaier,et al. Algorithm 808: ARfit—a matlab package for the estimation of parameters and eigenmodes of multivariate autoregressive models , 2001, TOMS.
[52] Thomas P. Bronez,et al. On the performance advantage of multitaper spectral analysis , 1992, IEEE Trans. Signal Process..