Luc Van Gool | Dengxin Dai | Arun Balajee Vasudevan | Jiri Matas