Detection of Audio-Video Synchronization Errors Via Event Detection
Sriram Sethuraman | Joshua Peter Ebenezer | Zongyi Liu | Yongjun Wu | Hai Wei
[1] Lorenzo Torresani, et al. Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization, 2018, NeurIPS.
[2] Dong Liu, et al. Deep High-Resolution Representation Learning for Human Pose Estimation, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[4] Dima Damen, et al. EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Tara N. Sainath, et al. Deep Learning for Audio Signal Processing, 2019, IEEE Journal of Selected Topics in Signal Processing.
[6] Naji Khosravan, et al. On Attention Modules for Audio-Visual Synchronization, 2018, CVPR Workshops.
[7] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] Jing Wang, et al. A Study on the Factors Affecting Audio-Video Subjective Experience in Virtual Reality Environments, 2017, 2017 International Conference on Virtual Reality and Visualization (ICVRV).
[9] Sainath Adapa, et al. Urban Sound Tagging using Convolutional Neural Networks, 2019, DCASE.
[10] Kamalesh Palanisamy, et al. Rethinking CNN Models for Audio Classification, 2020, ArXiv.
[11] Hervé Glotin, et al. Automatic acoustic detection of birds through deep learning: The first Bird Audio Detection challenge, 2018, Methods in Ecology and Evolution.
[12] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[13] Tuomas Virtanen, et al. Convolutional recurrent neural networks for bird audio detection, 2017, 2017 25th European Signal Processing Conference (EUSIPCO).
[14] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Mubarak Shah, et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, 2012, ArXiv.
[16] Walter Daems, et al. Low-cost synchronization of high-speed audio and video recordings in bio-acoustic experiments, 2018, Journal of Experimental Biology.
[17] Grzegorz Gwardys, et al. Deep Image Features in Music Information Retrieval, 2014.
[18] Joon Son Chung, et al. Out of Time: Automated Lip Sync in the Wild, 2016, ACCV Workshops.
[19] Andreas Dengel, et al. ESResNet: Environmental Sound Classification Based on Visual Domain Models, 2020, 2020 25th International Conference on Pattern Recognition (ICPR).
[20] Joon Son Chung, et al. Deep Audio-Visual Speech Recognition, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.