暂无分享,去创建一个
[1] Patrick Kenny,et al. Joint Factor Analysis Versus Eigenchannels in Speaker Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[3] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[4] Patrick Kenny,et al. Maximum likelihood estimation of eigenvoices and residual variances for large vocabulary speech recognition tasks , 2002, INTERSPEECH.
[5] Yoshua Bengio,et al. NICE: Non-linear Independent Components Estimation , 2014, ICLR.
[6] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[7] Sean A. Fulop. Speech Spectrum Analysis , 2011 .
[8] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..
[9] Dong Wang,et al. Deep Factorization for Speech Signal , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Lantian Li,et al. Deep Normalization for Speaker Vectors , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Hiroya Fujisaki,et al. Prosody, Models, and Spontaneous Speech , 1997, Computing Prosody.
[12] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[14] Gunnar Fant,et al. Acoustic Theory Of Speech Production , 1960 .
[15] Hao Tang,et al. An Unsupervised Autoregressive Model for Speech Representation Learning , 2019, INTERSPEECH.
[16] Yu Zhang,et al. Learning Latent Representations for Speech Generation and Transformation , 2017, INTERSPEECH.
[17] Yuxuan Wang,et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis , 2018, ICML.
[18] Eric Nalisnick,et al. Normalizing Flows for Probabilistic Modeling and Inference , 2019, J. Mach. Learn. Res..
[19] Marwan Al-Akaidi,et al. Introduction to speech processing , 2004 .
[20] J. Flanagan. Speech Analysis, Synthesis and Perception , 1971 .
[21] Prafulla Dhariwal,et al. Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.
[22] Samy Bengio,et al. Density estimation using Real NVP , 2016, ICLR.