暂无分享,去创建一个
[1] Yoshua Bengio,et al. Object Recognition with Gradient-Based Learning , 1999, Shape, Contour and Grouping in Computer Vision.
[2] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[3] Ramón Fernández Astudillo,et al. The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones , 2014, INTERSPEECH.
[4] Jun Guo,et al. DNN Filter Bank Cepstral Coefficients for Spoofing Detection , 2017, IEEE Access.
[5] Seiichi Nakagawa,et al. A deep neural network integrated with filterbank learning for speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Michael Elad,et al. Convolutional Neural Networks Analyzed via Convolutional Sparse Coding , 2016, J. Mach. Learn. Res..
[7] Quanshi Zhang,et al. Interpreting CNN knowledge via an Explanatory Graph , 2017, AAAI.
[8] Vishal Passricha,et al. End-to-End Acoustic Modeling Using Convolutional Neural Networks , 2019 .
[9] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[10] Mirco Ravanelli,et al. Deep Learning for Distant Speech Recognition , 2017, ArXiv.
[11] Maurizio Omologo,et al. Contaminated speech training methods for robust DNN-HMM distant speech recognition , 2017, INTERSPEECH.
[12] Yoshua Bengio,et al. Light Gated Recurrent Units for Speech Recognition , 2018, IEEE Transactions on Emerging Topics in Computational Intelligence.
[13] Ronald W. Schafer,et al. Theory and Applications of Digital Speech Processing , 2010 .
[14] Klaus-Robert Müller,et al. Interpreting and Explaining Deep Neural Networks for Classification of Audio Signals , 2018, ArXiv.
[15] Yoshua Bengio,et al. Twin Regularization for online speech recognition , 2018, INTERSPEECH.
[16] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[17] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[18] Hye-jin Shim,et al. A Complete End-to-End Speaker Verification System Using Deep Neural Networks: From Raw Signals to Verification Result , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Driss Matrouf,et al. Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification , 2012, INTERSPEECH.
[20] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[21] George Trigeorgis,et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Yoshua Bengio,et al. A network of deep neural networks for Distant Speech Recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Tara N. Sainath,et al. Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[24] Sébastien Marcel,et al. Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNS , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Sanjeev Khudanpur,et al. Acoustic Modelling from the Signal Domain Using CNNs , 2016, INTERSPEECH.
[26] John H. L. Hansen,et al. Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[27] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[28] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[29] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .
[30] Maurizio Omologo,et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[31] Kong-Aik Lee,et al. An extensible speaker identification sidekit in Python , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Titouan Parcollet,et al. The Pytorch-kaldi Speech Recognition Toolkit , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[34] Sanjeev Khudanpur,et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification , 2017, INTERSPEECH.
[35] Yoshua Bengio,et al. Speaker Recognition from Raw Waveform with SincNet , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[36] Sébastien Marcel,et al. On Learning Vocal Tract System Related Speaker Discriminative Information from Raw Signal Using CNNs , 2018, INTERSPEECH.
[37] Kai Yu,et al. End-to-end spoofing detection with raw waveform CLDNNS , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Hye-jin Shim,et al. Avoiding Speaker Overfitting in End-to-End DNNs Using Raw Waveform for Text-Independent Speaker Verification , 2018, INTERSPEECH.
[39] Carlos Guestrin,et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.
[40] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[41] Yoshua Bengio,et al. Batch-normalized joint training for DNN-based distant speech recognition , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[42] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.
[43] Stéphane Mallat,et al. Understanding deep convolutional networks , 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[44] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[45] Geoffrey E. Hinton,et al. Dynamic Routing Between Capsules , 2017, NIPS.
[46] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[47] Quanshi Zhang,et al. Visual interpretability for deep learning: a survey , 2018, Frontiers of Information Technology & Electronic Engineering.
[48] Maurizio Omologo,et al. A multi-channel corpus for distant-speech interaction in presence of known interferences , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Dimitri Palaz,et al. Analysis of CNN-based speech recognition system using raw speech as input , 2015, INTERSPEECH.
[50] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[51] Douglas A. Reynolds,et al. A unified deep neural network for speaker and language recognition , 2015, INTERSPEECH.
[52] Sridha Sridharan,et al. i-vector Based Speaker Recognition on Short Utterances , 2011, INTERSPEECH.
[53] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Dimitri Palaz,et al. End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition , 2019, Speech Commun..
[55] Stephen A. Dyer,et al. Digital signal processing , 2018, 8th International Multitopic Conference, 2004. Proceedings of INMIC 2004..
[56] Alun D. Preece,et al. Interpretability of deep learning models: A survey of results , 2017, 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI).
[57] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014 .
[58] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[59] Yoshua Bengio,et al. Improving Speech Recognition by Revising Gated Recurrent Units , 2017, INTERSPEECH.
[60] Maurizio Omologo,et al. Realistic Multi-Microphone Data Simulation for Distant Speech Recognition , 2016, INTERSPEECH.
[61] Petros Maragos,et al. The DIRHA simulated corpus , 2014, LREC.
[62] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[63] Patrick Kenny,et al. Deep Speaker Embeddings for Short-Duration Speaker Verification , 2017, INTERSPEECH.
[64] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[65] Iasonas Kokkinos,et al. Learning Filterbanks from Raw Speech for Phone Recognition , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[66] Tara N. Sainath,et al. Learning filter banks within a deep neural network framework , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[67] Ron J. Weiss,et al. Speech acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[68] Maurizio Omologo,et al. On the selection of the impulse responses for distant-speech recognition based on contaminated speech training , 2014, INTERSPEECH.
[69] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[70] Shrikanth S. Narayanan,et al. Modified-prior i-vector estimation for language identification of short duration utterances , 2014, INTERSPEECH.
[71] Maurizio Omologo,et al. Impulse response estimation for robust speech recognition in a reverberant environment , 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).
[72] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).