Learning to Localize Sound Source in Visual Scenes
Tae-Hyun Oh | Jun-Sik Kim | In-So Kweon | Ming-Hsuan Yang | Arda Senocak
[1] Benjamin Schrauwen,et al. Deep content-based music recommendation , 2013, NIPS.
[2] Michael Elad,et al. Pixels that sound , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[3] Yoav Y. Schechner,et al. Harmony in Motion , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[4] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[5] Andrew Owens,et al. Ambient Sound Provides Supervision for Visual Learning , 2016, ECCV.
[6] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[7] Kristen Grauman,et al. Learning Image Representations Tied to Ego-Motion , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] Jean Charles Bazin,et al. Suggesting Sounds for Images from Video Collections , 2016, ECCV Workshops.
[9] Shai Ben-David,et al. Understanding Machine Learning: From Theory to Algorithms , 2014.
[10] Alessio Del Bue,et al. Seeing the Sound: A New Multimodal Imaging Device for Computer Vision , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[11] Antonio Torralba,et al. See, Hear, and Read: Deep Aligned Representations , 2017, ArXiv.
[12] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[13] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[14] Harry L. Van Trees,et al. Optimum Array Processing: Part IV of Detection, Estimation, and Modulation Theory , 2002.
[15] B. Skinner. Superstition in the pigeon. , 1948, Journal of experimental psychology.
[16] Javier R. Movellan,et al. Audio Vision: Using Audio-Visual Synchrony to Locate Sounds , 1999, NIPS.
[17] Andrew Owens,et al. Visually Indicated Sounds , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[19] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[20] Robert S. Bolia,et al. Aurally Aided Visual Search in Three-Dimensional Space , 1999, Hum. Factors.
[21] Christopher Kanan,et al. Visual question answering: Datasets, algorithms, and future challenges , 2016, Comput. Vis. Image Underst.
[22] M. Corbetta,et al. Control of goal-directed and stimulus-driven attention in the brain , 2002, Nature Reviews Neuroscience.
[23] Mubarak Shah,et al. Multimodal Analysis for Identification and Segmentation of Moving-Sounding Objects , 2013, IEEE Transactions on Multimedia.
[24] William W. Gaver. What in the World Do We Hear? An Ecological Approach to Auditory Event Perception , 1993.
[25] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[26] B. R. Shelton,et al. The influence of vision on the absolute identification of sound-source position , 1980, Perception & psychophysics.
[27] Richard L. McKinley,et al. Aurally Aided Visual Search under Virtual and Free-Field Listening Conditions , 1996, Hum. Factors.
[28] Nir Ailon,et al. Deep Metric Learning Using Triplet Network , 2014, SIMBAD.
[29] Trevor Darrell,et al. Learning Joint Statistical Models for Audio-Visual Fusion and Segregation , 2000, NIPS.
[30] B. Kabanoff,et al. Eye movements in auditory space perception , 1975.
[31] Piotr Majdak,et al. 3-D localization of virtual sound sources: Effects of visual environment, pointing method, and training , 2010, Attention, perception & psychophysics.
[32] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Andrew Zisserman,et al. Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[34] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.