The Boombox: Visual Reconstruction from Acoustic Vibrations
[1] Jung-Woo Ha,et al. StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[2] Harshad Rai,et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .
[3] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[4] François Chaumette,et al. Visual Servoing and Visual Tracking , 2008, Springer Handbook of Robotics.
[5] Jiajun Wu,et al. Generative Modeling of Audible Shapes for Object Perception , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[6] Pieter Abbeel,et al. BigBIRD: A large-scale 3D database of object instances , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[7] Austin Reiter,et al. Audiovisual Zooming: What You See Is What You Hear , 2019, ACM Multimedia.
[8] Dinesh Manocha,et al. Reflection-Aware Sound Source Localization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[9] Ian Taylor,et al. Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[10] Andrew Owens,et al. Ambient Sound Provides Supervision for Visual Learning , 2016, ECCV.
[11] Chen Fang,et al. Visual to Sound: Generating Natural Sound for Videos in the Wild , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Lisa Feigenson,et al. Tracking individuals via object-files: evidence from infants' manual search , 2003 .
[13] Augusto Sarti,et al. TDOA-based acoustic source localization in the space–range reference frame , 2014, Multidimens. Syst. Signal Process.
[14] Deva Ramanan,et al. TAO: A Large-Scale Benchmark for Tracking Any Object , 2020, ECCV.
[15] Jiajun Wu,et al. Shape and Material from Sound , 2017, NIPS.
[16] Vincent Lepetit,et al. Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes , 2011, 2011 International Conference on Computer Vision.
[17] Antonio Torralba,et al. Through-Wall Human Pose Estimation Using Radio Signals , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] R. Baillargeon,et al. 2.5-Month-Old Infants' Reasoning about When Objects Should and Should Not Be Occluded , 1999, Cognitive Psychology.
[19] L. Rayleigh,et al. XII. On our perception of sound direction , 1907 .
[20] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[21] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Matthias Nießner,et al. 3DMatch: Learning the Matching of Local 3D Geometry in Range Scans , 2016, ArXiv.
[23] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[24] Qinghua Huang,et al. Gaussian filter for TDOA based sound source localization in multimedia surveillance , 2018, Multimedia Tools and Applications.
[25] Tae-Hyun Oh,et al. Speech2Face: Learning the Face Behind a Voice , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[27] Andrew Owens,et al. Visually Indicated Sounds , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Paul J. Besl,et al. A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell.
[29] R. Brooks. Planning Collision-Free Motions for Pick-and-Place Operations , 1983 .
[30] Yashraj S. Narang,et al. STReSSD: Sim-To-Real from Sound for Stochastic Dynamics , 2020, CoRL.
[31] Thomas Brox,et al. FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Frédo Durand,et al. Turning Corners into Cameras: Principles and Methods , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[33] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[34] Paul J. Besl,et al. Method for registration of 3-D shapes , 1992, Other Conferences.
[35] Martin Vetterli,et al. Acoustic echoes reveal room shape , 2013, Proceedings of the National Academy of Sciences.
[36] G. Carter,et al. The generalized correlation method for estimation of time delay , 1976 .
[37] E. Malis. Survey of vision-based robot control , 2002 .
[38] Xudong Ma,et al. Robust tracking of moving sound source using multiple model Kalman filter , 2008 .
[39] Hod Lipson,et al. Visual behavior modelling for robotic theory of mind , 2021, Scientific reports.
[40] Dhiraj Gandhi,et al. Swoosh! Rattle! Thump! - Actions that Sound , 2020, Robotics: Science and Systems.
[41] Xiaogang Wang,et al. Vision-Infused Deep Audio Inpainting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[42] Dima Damen,et al. EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[43] Felix Heide,et al. Steady-State Non-Line-Of-Sight Imaging , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Sertac Karaman,et al. Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[45] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[46] Shun-Po Chuang,et al. Towards Audio to Scene Image Synthesis Using Generative Adversarial Network , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Andrew Zisserman,et al. Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[48] Simon Lucey,et al. Argoverse: 3D Tracking and Forecasting With Rich Maps , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Gordon Wetzstein,et al. Acoustic Non-Line-Of-Sight Imaging , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Fanjun Bu,et al. Object Permanence Through Audio-Visual Representations , 2021, IEEE Access.
[51] J. C. Middlebrooks. Sound localization. , 2015, Handbook of clinical neurology.
[52] Sergey Levine,et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res.