Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
[1] Nuno Vasconcelos,et al. Self-Supervised Generation of Spatial Audio for 360 Video , 2018, NIPS 2018.
[2] Dingzeyu Li,et al. Scene-aware audio for 360° videos , 2018, ACM Trans. Graph.
[3] Chuang Gan,et al. Self-Supervised Moving Vehicle Tracking With Stereo Sound , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[4] Antonio Torralba,et al. LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.
[5] William W. Gaver. What in the World Do We Hear? An Ecological Approach to Auditory Event Perception , 1993.
[6] Jae S. Lim,et al. Signal estimation from modified short-time Fourier transform , 1983, ICASSP.
[7] Andrew Owens,et al. Audio-Visual Scene Analysis with Self-Supervised Multisensory Features , 2018, ECCV.
[8] Benjamin Höferlin,et al. Evaluation of background subtraction techniques for video surveillance , 2011, CVPR 2011.
[9] John W. McDonough,et al. Kalman Filters for Time Delay of Arrival-Based Source Localization , 2005, EURASIP J. Adv. Signal Process.
[10] Ingmar Posner,et al. Leveraging the urban soundscape: Auditory perception for smart vehicles , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[11] Lawrence D. Rosenblum,et al. Echolocating Distance by Moving and Stationary Listeners , 2000.
[12] Hirokazu Kameoka,et al. Seeing through Sounds: Predicting Visual Semantic Segmentation Results from Multichannel Audio Signals , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Emanuel A. P. Habets,et al. Inference of Room Geometry From Acoustic Impulse Responses , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Justin Salamon,et al. A Dataset and Taxonomy for Urban Sound Research , 2014, ACM Multimedia.
[15] H. Wallach,et al. The role of head movements and vestibular and visual cues in sound localization. , 1940.
[16] Iván V. Meza,et al. Localization of sound sources in robotics: A review , 2017, Robotics Auton. Syst.
[17] Andrew Zisserman,et al. Emotion Recognition in Speech using Cross-Modal Transfer in the Wild , 2018, ACM Multimedia.
[18] Guy J. Brown,et al. Computational auditory scene analysis , 1994, Comput. Speech Lang.
[19] Gabriel J. Brostow,et al. Digging Into Self-Supervised Monocular Depth Estimation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Dinesh Manocha,et al. 3D Reconstruction in the presence of glasses by acoustic and stereo fusion , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Rogério Schmidt Feris,et al. Learning to Separate Object Sounds by Watching Unlabeled Video , 2018, ECCV.
[22] Marie-Francine Moens,et al. Talk2Car: Taking Control of Your Self-Driving Car , 2019, EMNLP.
[23] Roland Siegwart,et al. The current state and future outlook of rescue robotics , 2019, J. Field Robotics.
[24] H. Bülthoff,et al. Merging the senses into a robust percept , 2004, Trends in Cognitive Sciences.
[25] W R Thurlow,et al. Head movements during sound localization. , 1967, The Journal of the Acoustical Society of America.
[26] Luc Van Gool,et al. Revisiting Multi-Task Learning in the Deep Learning Era , 2020, ArXiv.
[27] Paul Hurley,et al. DeepWave: A Recurrent Neural-Network for Real-Time Acoustic Imaging , 2019, NeurIPS.
[28] Andreas Geiger,et al. Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res.
[29] Kristen Grauman,et al. Co-Separating Sounds of Visual Objects , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[31] Ashutosh Saxena,et al. Learning sound location from a single microphone , 2009, 2009 IEEE International Conference on Robotics and Automation.
[32] George Papandreou,et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.
[33] Andrew Owens,et al. Ambient Sound Provides Supervision for Visual Learning , 2016, ECCV.
[34] Luc Van Gool,et al. End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners , 2018, ECCV.
[35] Andrew Zisserman,et al. Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[36] Russell L. Martin,et al. Sound localization with head movement: implications for 3-d audio displays , 2014, Front. Neurosci.
[37] Chenliang Xu,et al. Audio-Visual Event Localization in Unconstrained Videos , 2018, ECCV.
[38] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[39] Adrian Hilton,et al. 3D Room Geometry Reconstruction Using Audio-Visual Sensors , 2017, 2017 International Conference on 3D Vision (3DV).
[40] Martin Vetterli,et al. Acoustic echoes reveal room shape , 2013, Proceedings of the National Academy of Sciences.
[41] Andrew Owens,et al. Visually Indicated Sounds , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Robert Fendrich,et al. The Merging of the Senses , 1993, Journal of Cognitive Neuroscience.
[43] Federico Domínguez,et al. SoundCompass: A Distributed MEMS Microphone Array-Based Sensor for Sound Source Localization , 2014, Sensors.
[44] Alan L. Yuille,et al. Towards unified depth and semantic prediction from a single image , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Weidong Huang,et al. Human Factors in Augmented Reality Environments , 2012, Springer New York.
[46] Tae-Hyun Oh,et al. Learning to Localize Sound Source in Visual Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[47] Bruno Fazenda,et al. Acoustic based safety emergency vehicle detection for intelligent transport systems , 2009, 2009 ICCAS-SICE.
[48] Chuang Gan,et al. The Sound of Motions , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[49] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[51] Philippe Souères,et al. A survey on sound source localization in robotics: From binaural to array processing methods , 2015, Comput. Speech Lang.
[52] Alejandro Cartas,et al. Seeing and Hearing Egocentric Actions: How Much Can We Learn? , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[53] Luc Van Gool,et al. Object Referring in Visual Scene with Spoken Language , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[54] Andrew Zisserman,et al. Objects that Sound , 2017, ECCV.
[55] Yoav Y. Schechner,et al. Harmony in Motion , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[56] Chuang Gan,et al. The Sound of Pixels , 2018, ECCV.