Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams
[1] C. Ballantine. On the Hadamard product, 1968.
[2] Mark D. Plumbley, et al. Sound Event Detection with Sequentially Labelled Data Based on Connectionist Temporal Classification and Unsupervised Clustering, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Daniel P. W. Ellis, et al. Locating singing voice segments within music signals, 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).
[4] Lie Lu, et al. Automated extraction of music snippets, 2003, ACM Multimedia.
[5] Israel Cohen, et al. Audio-Visual Voice Activity Detection Using Diffusion Maps, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Haizhou Li, et al. An overview of text-independent speaker recognition: From features to supervectors, 2010, Speech Commun.
[7] Tetsuo Kosaka, et al. Improving Voice Activity Detection for Multimodal Movie Dialogue Corpus, 2018, 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE).
[8] Annamaria Mesaros, et al. Metrics for Polyphonic Sound Event Detection, 2016.
[9] George Saon, et al. Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[11] Tae-Hyun Oh, et al. Learning to Localize Sound Source in Visual Scenes, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Charles R. Johnson, et al. Topics in matrix analysis: The Hadamard product, 1991.
[13] Norihiro Hagita, et al. Real-time audio-visual voice activity detection for speech recognition in noisy environments, 2010, AVSP.
[14] Emmanuel Vincent, et al. Sound Event Detection in the DCASE 2017 Challenge, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Juhan Nam, et al. Revisiting Singing Voice Detection: A quantitative review and the future outlook, 2018, ISMIR.
[16] Fathi M. Salem, et al. Gate-variants of Gated Recurrent Unit (GRU) neural networks, 2017, 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS).
[17] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks, 2016, ICML.
[18] Carlos Busso, et al. End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models, 2018, Speech Commun.
[19] Liu Peng, et al. Audio-visual voice activity detection, 2006.
[20] Hiroshi G. Okuno, et al. A Speaker Diarization System with Robust Speaker Localization and Voice Activity Detection, 2013.
[21] Jianwu Dang, et al. Phase aware deep neural network for noise robust voice activity detection, 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).
[22] Jian Luan, et al. Transfer Learning for Improving Singing-voice Detection in Polyphonic Instrumental Music, 2020, INTERSPEECH.
[23] Jean-Luc Gauvain, et al. Optimization of RNN-Based Speech Activity Detection, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.