Optimization of RNN-Based Speech Activity Detection
[1] Xiao Fu, et al. Quantum Behaved Particle Swarm Optimization with Neighborhood Search for Numerical Optimization, 2013.
[2] Peder A. Olsen, et al. Voicing features for robust speech detection, 2005, INTERSPEECH.
[3] Nozomu Hamada, et al. Noise robust Voice Activity Detection for multiple speakers, 2010, 2010 International Symposium on Intelligent Signal Processing and Communication Systems.
[4] Gregory Gelly, et al. Neural Networks as a Guidance Solution for Soft-Landing and Aerocapture, 2009.
[5] Sridha Sridharan, et al. Noise robust voice activity detection using features extracted from the time-domain autocorrelation function, 2010, INTERSPEECH.
[6] Paul J. Werbos, et al. Backpropagation Through Time: What It Does and How to Do It, 1990, Proc. IEEE.
[7] Yusuke Kida, et al. Voice Activity Detection: Merging Source and Filter-based Information, 2016, IEEE Signal Processing Letters.
[8] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.
[9] Aaron E. Rosenberg, et al. An improved endpoint detector for isolated word recognition, 1981.
[10] Björn W. Schuller, et al. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Kai Yu, et al. A comparative study of robustness of deep learning approaches for VAD, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Fei Xie, et al. A comparative study of speech detection methods, 1997, EUROSPEECH.
[13] Jean-Luc Gauvain, et al. Developing STT and KWS systems using limited language resources, 2014, INTERSPEECH.
[14] Brian Kingsbury, et al. Improvements to the IBM speech activity detection system for the DARPA RATS program, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Izhak Shafran, et al. Robust speech detection and segmentation for real-time ASR applications, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[16] Maurice Clerc, et al. The particle swarm - explosion, stability, and convergence in a multidimensional complex space, 2002, IEEE Trans. Evol. Comput.
[17] Russell C. Eberhart, et al. A new optimizer using particle swarm theory, 1995, MHS'95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science.
[18] Petros Maragos, et al. Speech event detection using multiband modulation energy, 2005, INTERSPEECH.
[19] Jean-Luc Gauvain, et al. Minimum word error training of RNN-based voice activity detection, 2015, INTERSPEECH.
[20] Shrikanth S. Narayanan, et al. Robust Voice Activity Detection Using Long-Term Signal Variability, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Razvan Pascanu, et al. On the difficulty of training recurrent neural networks, 2012, ICML.
[22] Jürgen Schmidhuber, et al. Learning Precise Timing with LSTM Recurrent Networks, 2003, J. Mach. Learn. Res.
[23] Alex Graves, et al. Supervised Sequence Labelling, 2012.
[24] Ilya Sutskever, et al. Learning Recurrent Neural Networks with Hessian-Free Optimization, 2011, ICML.
[25] Mark Liberman, et al. Speech activity detection on youtube using deep neural networks, 2013, INTERSPEECH.
[26] Chau Khoa Pham. Noise robust voice activity detection, 2013.
[27] Surya Ganguli, et al. An adaptive low dimensional quasi-Newton sum of functions optimizer, 2013, ArXiv.
[28] Wenbo Xu, et al. Particle swarm optimization with particles having quantum behavior, 2004, Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753).
[29] Spyridon Matsoukas, et al. Developing a Speech Activity Detection System for the DARPA RATS Program, 2012, INTERSPEECH.
[30] Olivier Galibert, et al. A presentation of the REPERE challenge, 2012, 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI).
[31] Paul Gay. Segmentation et identification audiovisuelle de personnes dans des journaux télévisés. (Audiovisual segmentation and identification of persons in broadcast news), 2015.
[32] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[33] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Xiaojun Wu, et al. Convergence analysis and improvements of quantum-behaved particle swarm optimization, 2012, Inf. Sci.
[35] Javier Ramírez, et al. Efficient voice activity detection algorithms using long-term speech information, 2004, Speech Commun.
[36] Kuldip K. Paliwal, et al. Bidirectional recurrent neural networks, 1997, IEEE Trans. Signal Process.
[37] Razvan Pascanu, et al. Advances in optimizing recurrent networks, 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[38] Martin A. Riedmiller, et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm, 1993, IEEE International Conference on Neural Networks.
[39] Xiao-Lei Zhang, et al. Deep Belief Networks Based Voice Activity Detection, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[40] Alex Graves, et al. Supervised Sequence Labelling with Recurrent Neural Networks, 2012, Studies in Computational Intelligence.
[41] Giovanni Soda, et al. Exploiting the past and the future in protein secondary structure prediction, 1999, Bioinform.
[42] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[43] Stan Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se, 1980.
[44] Thad Hughes, et al. Recurrent neural networks for voice activity detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.