Multimodal Recurrent Neural Networks With Information Transfer Layers for Indoor Scene Labeling
暂无分享,去创建一个
Gang Wang | Lap-Pui Chau | Bing Shuai | Abrar H. Abdulnabi | Zhen Zuo | G. Wang | Lap-Pui Chau | Bing Shuai | Zhen Zuo
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Yong Jae Lee,et al. Discovering important people and objects for egocentric video summarization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[3] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Miguel Á. Carreira-Perpiñán,et al. Multiscale conditional random fields for image labeling , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[5] Gang Wang,et al. Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling , 2014, ECCV.
[6] Luc Van Gool,et al. Depth and Appearance for Mobile Scene Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[7] Deva Ramanan,et al. Detecting activities of daily living in first-person camera views , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[8] Robert Ho. Canonical Correlation Analysis , 2013 .
[9] Zhi-Hua Zhou,et al. A New Analysis of Co-Training , 2010, ICML.
[10] Yann LeCun,et al. Indoor Semantic Segmentation using depth information , 2013, ICLR.
[11] Derek Hoiem,et al. Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.
[12] Dieter Fox,et al. RGB-(D) scene labeling: Features and algorithms , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[13] Bastian Leibe,et al. Dense 3D semantic mapping of indoor scenes from RGB-D images , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[14] Neil Martin Robertson,et al. Deep Head Pose: Gaze-Direction Estimation in Multimodal Video , 2015, IEEE Transactions on Multimedia.
[15] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[16] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[17] Gang Wang,et al. Quaddirectional 2D-Recurrent Neural Networks For Image Labeling , 2015, IEEE Signal Processing Letters.
[18] Yunchao Wei,et al. Perceptual Generative Adversarial Networks for Small Object Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Dariu Gavrila,et al. Multi-cue Pedestrian Detection and Tracking from a Moving Vehicle , 2007, International Journal of Computer Vision.
[20] Songcan Chen,et al. MultiK-MHKS: A Novel Multiple Kernel Learning Algorithm , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] Honglak Lee,et al. Deep learning for detecting robotic grasps , 2013, Int. J. Robotics Res..
[22] Gang Wang,et al. Multi-Task CNN Model for Attribute Prediction , 2015, IEEE Transactions on Multimedia.
[23] Jörg Stückler,et al. Dense real-time mapping of object-class semantics from RGB-D video , 2013, Journal of Real-Time Image Processing.
[24] Jürgen Schmidhuber,et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.
[25] Luc Van Gool,et al. Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Gang Wang,et al. Episodic CAMN: Contextual Attention-Based Memory Networks with Iterative Feedback for Scene Labeling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Gang Wang,et al. Convolutional recurrent neural networks: Learning spatial dependencies for image representation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[28] Dariu Gavrila,et al. Real-time dense stereo for intelligent vehicles , 2006, IEEE Transactions on Intelligent Transportation Systems.
[29] Gang Wang,et al. Beyond Forward Shortcuts: Fully Convolutional Master-Slave Networks (MSNets) with Backward Skip Connections for Semantic Segmentation , 2017, ArXiv.
[30] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[31] Rob Fergus,et al. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[32] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[33] Jürgen Schmidhuber,et al. Multi-dimensional Recurrent Neural Networks , 2007, ICANN.
[34] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[35] Fuchun Sun,et al. Unsupervised multimodal feature learning for semantic image segmentation , 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).
[36] Jitendra Malik,et al. Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[37] S. Munder,et al. Pedestrian recognition using combined low-resolution depth and intensity images , 2008, 2008 IEEE Intelligent Vehicles Symposium.
[38] T. Munich,et al. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks , 2008, NIPS.
[39] César Cadena,et al. Semantic Parsing for Priming Object Detection in RGB-D Scenes , 2013 .
[40] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Yann LeCun,et al. Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers , 2012, ICML.
[42] Qiuping Xu. Canonical correlation Analysis , 2014 .
[43] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[44] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[45] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[46] Michael R. Lyu,et al. A Multimodal and Multilevel Ranking Scheme for Large-Scale Video Retrieval , 2008, IEEE Transactions on Multimedia.
[47] Dacheng Tao,et al. Robust Face Recognition via Multimodal Deep Face Representation , 2015, IEEE Transactions on Multimedia.
[48] Tao Mei,et al. Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Jürgen Schmidhuber,et al. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition , 2005, ICANN.
[50] Jitendra Malik,et al. Cross Modal Distillation for Supervision Transfer , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Jitendra Malik,et al. Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation , 2015, International Journal of Computer Vision.
[52] Eric P. Xing,et al. Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity , 2009, ICML.
[53] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[54] Zhibin Hong,et al. Tracking via Robust Multi-task Multi-view Joint Sparse Representation , 2013, 2013 IEEE International Conference on Computer Vision.
[55] Ronan Collobert,et al. Recurrent Convolutional Neural Networks for Scene Labeling , 2014, ICML.
[56] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[57] Nathan Silberman,et al. Indoor scene segmentation using a structured light sensor , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
[58] Honglak Lee,et al. Improved Multimodal Deep Learning with Variation of Information , 2014, NIPS.
[59] Marcus Liwicki,et al. Scene labeling with LSTM recurrent neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] James M. Keller,et al. Histogram of Oriented Normal Vectors for Object Recognition with a Depth Sensor , 2012, ACCV.
[61] Wenwu Zhu,et al. Learning Compact Hash Codes for Multimodal Representations Using Orthogonal Deep Structure , 2015, IEEE Transactions on Multimedia.
[62] Dariu Gavrila,et al. High-Level Fusion of Depth and Intensity for Pedestrian Classification , 2009, DAGM-Symposium.
[63] Sheng Tang,et al. Accurate Estimation of Human Body Orientation From RGB-D Sensors , 2013, IEEE Transactions on Cybernetics.
[64] Cordelia Schmid,et al. Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.
[65] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[66] Dacheng Tao,et al. A Survey on Multi-view Learning , 2013, ArXiv.
[67] Claire Cardie,et al. Opinion Mining with Deep Recurrent Neural Networks , 2014, EMNLP.
[68] Sven Behnke,et al. Learning depth-sensitive conditional random fields for semantic segmentation of RGB-D images , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[69] Jana Kosecka,et al. Semantic parsing for priming object detection in indoors RGB-D scenes , 2015, Int. J. Robotics Res..
[70] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[71] Mohammed Bennamoun,et al. Geometry Driven Semantic Labeling of Indoor Scenes , 2014, ECCV.
[72] Razvan Pascanu,et al. Advances in optimizing recurrent networks , 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.