Audiovisual saliency prediction via deep learning
Hefei Ling | Jiazhong Chen | Ping Duan | Qingqing Li | Dakai Ren
[1] D. Whitaker,et al. Sensory uncertainty governs the extent of audio-visual interaction , 2004, Vision Research.
[2] Faheem Khan,et al. Speaker separation using visually-derived binary masks , 2013, AVSP.
[3] Laurent Itti,et al. Automatic foveation for video compression using a neurobiological model of visual attention , 2004, IEEE Transactions on Image Processing.
[4] Liqiang Nie,et al. Neural Multimodal Cooperative Learning Toward Micro-Video Understanding , 2020, IEEE Transactions on Image Processing.
[5] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[6] A. Coutrot,et al. An efficient audiovisual saliency model to predict eye positions when looking at conversations , 2015, 2015 23rd European Signal Processing Conference (EUSIPCO).
[7] Pietro Perona,et al. Graph-Based Visual Saliency , 2006, NIPS.
[8] R. Venkatesh Babu,et al. DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations , 2015, IEEE Transactions on Image Processing.
[9] Antoine Coutrot,et al. An audiovisual attention model for natural conversation scenes , 2014, 2014 IEEE International Conference on Image Processing (ICIP).
[10] Frédo Durand,et al. Learning to predict where humans look , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[11] Antoine Coutrot,et al. Toward the introduction of auditory information in dynamic visual attention models , 2013, 2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS).
[12] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[13] Cristian Sminchisescu,et al. Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[14] Jianbing Shen,et al. Deep Visual Attention Prediction , 2018, IEEE Transactions on Image Processing.
[15] Zhuowen Tu,et al. Deeply-Supervised Nets , 2014, AISTATS.
[16] Nicolas Riche,et al. Audio-visual attention: Eye-tracking dataset and analysis toolbox , 2017, 2017 IEEE International Conference on Image Processing (ICIP).
[17] A. King,et al. The superior colliculus , 2004, Current Biology.
[18] Vaibhava Goel,et al. Deep multimodal learning for Audio-Visual Speech Recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Dorothea Kolossa,et al. Audiovisual speech recognition with missing or unreliable data , 2009, AVSP.
[20] Aykut Erdem,et al. Spatio-Temporal Saliency Networks for Dynamic Saliency Prediction , 2016, IEEE Transactions on Multimedia.
[21] S. Ullman,et al. Shifts in selective visual attention: towards the underlying neural circuitry , 1985, Human Neurobiology.
[22] Hugo Larochelle,et al. Recurrent Mixture Density Network for Spatiotemporal Visual Attention , 2016, ICLR.
[23] Aggelos K. Katsaggelos,et al. Efficient Video Object Segmentation via Network Modulation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Yoav Y. Schechner,et al. Harmony in Motion , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Mubarak Shah,et al. Action MACH: a spatio-temporal Maximum Average Correlation Height filter for action recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[26] A. Coutrot,et al. How saliency, faces, and sound influence gaze in dynamic social scenes , 2014, Journal of Vision.
[27] Petros Maragos,et al. Towards a behaviorally-validated computational audiovisual saliency model , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Andrew Owens,et al. Audio-Visual Scene Analysis with Self-Supervised Multisensory Features , 2018, ECCV.
[29] Ali Borji,et al. Revisiting Video Saliency: A Large-Scale Benchmark and a New Model , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Naila Murray,et al. End-to-End Saliency Mapping via Probability Distribution Prediction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Antón García-Díaz,et al. Saliency from hierarchical adaptation through decorrelation and variance normalization , 2012, Image Vis. Comput.
[32] A. Treisman,et al. A feature-integration theory of attention , 1980, Cognitive Psychology.
[33] Christof Koch,et al. Image Signature: Highlighting Sparse Salient Regions , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Nicolas Riche,et al. RARE2012: A multi-scale rarity-based saliency detection with its comparative statistical analysis , 2013, Signal Process. Image Commun.
[35] Rong Li,et al. Attention region detection based on closure prior in layered bit planes , 2017, Neurocomputing.
[36] Jongwook Choi,et al. Supervising Neural Attention Models for Video Captioning by Human Gaze Data , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Tianming Liu,et al. Predicting eye fixations using convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Liming Zhang,et al. A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression , 2010, IEEE Transactions on Image Processing.
[39] Michael Elad,et al. Pixels that sound , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[40] Zhou Wang,et al. Video saliency incorporating spatiotemporal cues and uncertainty weighting , 2013, 2013 IEEE International Conference on Multimedia and Expo (ICME).
[41] Christof Koch,et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009.
[42] Duan-Yu Chen,et al. Preserving Motion-Tolerant Contextual Visual Saliency for Video Resizing , 2013, IEEE Transactions on Multimedia.
[43] Chuang Gan,et al. The Sound of Pixels , 2018, ECCV.
[44] Ali Borji,et al. Exploiting local and global patch rarities for saliency detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[45] Michael Dorr,et al. Large-Scale Optimization of Hierarchical Features for Saliency Prediction in Natural Images , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[46] Andrew Zisserman,et al. Objects that Sound , 2017, ECCV.
[47] Zhou Wang,et al. Foveation scalable video coding with automatic fixation selection , 2003, IEEE Transactions on Image Processing.
[48] Frédéric Berthommier,et al. A phonetically neutral model of the low-level audio-visual interaction , 2004, Speech Commun.