Are You Speaking: Real-Time Speech Activity Detection via Landmark Pooling Network
[1] Hugo Van hamme, et al. Who's Speaking?: Audio-Supervised Classification of Active Speakers in Video, 2015, ICMI.
[2] Samer Al Moubayed, et al. Towards speaker detection using lips movements for human-machine multiparty dialogue, 2012.
[3] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[4] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[5] Andrew Zisserman, et al. Taking the bite out of automated naming of characters in TV video, 2009, Image Vis. Comput.
[6] Rajesh M. Hegde, et al. Active Speaker Detection using audio-visual sensor array, 2014, IEEE International Symposium on Signal Processing and Information Technology (ISSPIT).
[7] Joon Son Chung, et al. Lip Reading in the Wild, 2016, ACCV.
[8] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[9] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[10] Hugo Van hamme, et al. Active speaker detection with audio-visual co-training, 2016, ICMI.
[11] Larry S. Davis, et al. Look who's talking: speaker detection using video and audio correlation, 2000, IEEE International Conference on Multimedia and Expo (ICME 2000).
[12] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[13] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[14] Ishwar K. Sethi, et al. Cross-Modal Analysis of Audio-Visual Programs for Speaker Detection, 2005, IEEE 7th Workshop on Multimedia Signal Processing.