[1] David Applebaum,et al. Probability and Information: An Integrated Approach, 2008.
[2] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[3] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[4] George Trigeorgis,et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Titouan Parcollet,et al. The PyTorch-Kaldi Speech Recognition Toolkit , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[7] Yoshua Bengio,et al. Mutual Information Neural Estimation , 2018, ICML.
[8] Douglas A. Reynolds,et al. Deep Neural Network Approaches to Speaker and Language Recognition , 2015, IEEE Signal Processing Letters.
[9] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[10] Bhuvana Ramabhadran,et al. Invariant Representations for Noisy Speech Recognition , 2016, ArXiv.
[11] Dimitri Palaz,et al. Analysis of CNN-based speech recognition system using raw speech as input , 2015, INTERSPEECH.
[12] Yun Lei,et al. Advances in deep neural network approaches to speaker recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Yoshua Bengio,et al. A network of deep neural networks for Distant Speech Recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[15] Liam Paninski,et al. Estimation of Entropy and Mutual Information , 2003, Neural Computation.
[16] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Sébastien Marcel,et al. Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNs , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[20] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Yoshua Bengio,et al. Speaker Recognition from Raw Waveform with SincNet , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[22] J. Kinney,et al. Equitability, mutual information, and the maximal information coefficient , 2013, Proceedings of the National Academy of Sciences.
[23] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[24] Maurizio Omologo,et al. Contaminated speech training methods for robust DNN-HMM distant speech recognition , 2017, INTERSPEECH.
[25] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[26] Maurizio Omologo,et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[27] Xiao Liu,et al. Deep Speaker: an End-to-End Neural Speaker Embedding System , 2017, ArXiv.
[28] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[29] Yoshua Bengio,et al. Interpretable Convolutional Filters with SincNet , 2018, ArXiv.
[30] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach, 2014.
[31] Maurizio Omologo,et al. Impulse response estimation for robust speech recognition in a reverberant environment , 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).
[32] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Yoshua Bengio,et al. Learning deep representations by mutual information estimation and maximization , 2018, ICLR.
[34] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[35] Mirco Ravanelli,et al. Deep Learning for Distant Speech Recognition , 2017, ArXiv.
[36] Jean-Marc Odobez,et al. Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization , 2018, INTERSPEECH.
[37] Pietro Liò,et al. Deep Graph Infomax , 2018, ICLR.
[38] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM, 1993, NIST.
[39] Driss Matrouf,et al. Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification , 2012, INTERSPEECH.
[40] Yoshua Bengio,et al. Learning Independent Features with Adversarial Nets for Non-linear ICA , 2017, arXiv:1710.05050.
[41] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[42] Panayiotis G. Georgiou,et al. Neural Predictive Coding Using Convolutional Neural Networks Toward Unsupervised Learning of Speaker Characteristics , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[43] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[44] Yoshua Bengio,et al. Batch-normalized joint training for DNN-based distant speech recognition , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).