Cross-Modal Retrieval via Deep and Bidirectional Representation Learning
暂无分享,去创建一个
Jian Wang | Shiming Xiang | Chunhong Pan | Cuicui Kang | Yonghao He | Shiming Xiang | Chunhong Pan | Yonghao He | Cuicui Kang | Jian Wang
[1] Phil Blunsom,et al. A Convolutional Neural Network for Modelling Sentences , 2014, ACL.
[2] Andrew Y. Ng,et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.
[3] Xiaohua Zhai,et al. Effective Heterogeneous Similarity Measure with Nearest Neighbors for Cross-Media Retrieval , 2012, MMM.
[4] Yueting Zhuang,et al. A low rank structural large margin method for cross-modal ranking , 2013, SIGIR.
[5] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[6] David W. Jacobs,et al. Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[8] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[9] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[10] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[11] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[12] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints , 2011 .
[13] Yansheng Lu,et al. A semantic model for cross-modal and multi-modal retrieval , 2013, ICMR '13.
[14] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[15] Beng Chin Ooi,et al. Effective Multi-Modal Retrieval based on Stacked Auto-Encoders , 2014, Proc. VLDB Endow..
[16] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[17] Andrea Vedaldi,et al. MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.
[18] B. S. Manjunath,et al. Color and texture descriptors , 2001, IEEE Trans. Circuits Syst. Video Technol..
[19] Yueting Zhuang,et al. Cross-media semantic representation via bi-directional learning to rank , 2013, ACM Multimedia.
[20] Geoffrey E. Hinton,et al. Generating Text with Recurrent Neural Networks , 2011, ICML.
[21] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[22] Michael I. Jordan,et al. Modeling annotated data , 2003, SIGIR.
[23] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[24] Wei Wang,et al. Learning Coupled Feature Spaces for Cross-Modal Matching , 2013, 2013 IEEE International Conference on Computer Vision.
[25] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[26] Yi Yang,et al. Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-Media Retrieval , 2008, IEEE Transactions on Multimedia.
[27] Hang Li,et al. Convolutional Neural Network Architectures for Matching Natural Language Sentences , 2014, NIPS.
[28] Cícero Nogueira dos Santos,et al. Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts , 2014, COLING.
[29] Changsheng Xu,et al. Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval , 2015, IEEE Transactions on Multimedia.
[30] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[31] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] R. Mukundan,et al. Moment Functions in Image Analysis: Theory and Applications , 1998 .
[33] Roman Rosipal,et al. Overview and Recent Advances in Partial Least Squares , 2005, SLSFS.
[34] Beng Chin Ooi,et al. Effective deep learning-based multi-modal retrieval , 2015, The VLDB Journal.
[35] Roger Levy,et al. A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.
[36] Larry P. Heck,et al. Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.
[37] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[38] Geoffrey E. Hinton,et al. Deep Boltzmann Machines , 2009, AISTATS.
[39] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[40] Xiaogang Wang,et al. Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.
[41] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[42] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[43] Paul Clough,et al. The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems , 2006 .
[44] Xiang Zhang,et al. Text Understanding from Scratch , 2015, ArXiv.
[45] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[46] Peter Young,et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions , 2014, TACL.
[47] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..
[48] D. Jacobs,et al. Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch , 2011, CVPR 2011.
[49] Hang Li,et al. A Deep Architecture for Matching Short Texts , 2013, NIPS.
[50] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Zhou Yu,et al. Discriminative coupled dictionary hashing for fast cross-media retrieval , 2014, SIGIR.
[52] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[53] Xiaohua Zhai,et al. Cross-modality correlation propagation for cross-media retrieval , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Tong Zhang,et al. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks , 2014, NAACL.
[55] Geoffrey E. Hinton,et al. Replicated Softmax: an Undirected Topic Model , 2009, NIPS.
[56] Trevor Darrell,et al. Learning cross-modality similarity for multinomial data , 2011, 2011 International Conference on Computer Vision.
[57] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[58] Joshua B. Tenenbaum,et al. Separating Style and Content with Bilinear Models , 2000, Neural Computation.
[59] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[60] Jonathan Tompson,et al. Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation , 2014, NIPS.