Two-stage deep learning for supervised cross-modal retrieval
[1] Fei Su,et al. 3View deep canonical correlation analysis for cross-modal retrieval , 2015, 2015 Visual Communications and Image Processing (VCIP).
[2] Yuxin Peng,et al. Cross-modal deep metric learning with multi-task regularization , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).
[3] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[4] Yao Zhao,et al. Modality-Dependent Cross-Media Retrieval , 2015, ACM Trans. Intell. Syst. Technol.
[5] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[6] Andrew Y. Ng,et al. Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.
[7] Weizhi Nie,et al. Cross-domain semantic transfer from large-scale social media , 2014, Multimedia Systems.
[8] Roger Levy,et al. A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.
[9] Nicu Sebe,et al. Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation , 2016, IEEE Transactions on Image Processing.
[10] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[11] Yueting Zhuang,et al. Cross-media semantic representation via bi-directional learning to rank , 2013, ACM Multimedia.
[12] Paul Smolensky,et al. Information processing in dynamical systems: foundations of harmony theory , 1986.
[13] Shengcai Liao,et al. Cross-Modal Similarity Learning: A Low Rank Bilinear Formulation , 2014, CIKM.
[14] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[15] Xuelong Li,et al. From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[16] Samy Bengio,et al. A Discriminative Kernel-Based Approach to Rank Images from Text Queries , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Xiaoshuai Sun,et al. Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length , 2018, IEEE Transactions on Multimedia.
[18] Daoqiang Zhang,et al. Canonical sparse cross-view correlation analysis , 2016, Neurocomputing.
[19] Tat-Seng Chua,et al. NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.
[20] Xiaogang Wang,et al. Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.
[21] Shiliang Sun,et al. A survey of multi-view machine learning , 2013, Neural Computing and Applications.
[22] Yann LeCun,et al. Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[23] Nitish Srivastava,et al. Learning Representations for Multimodal Data with Deep Belief Nets , 2012.
[24] Shiliang Sun,et al. Active learning with extremely sparse labeled examples , 2010, Neurocomputing.
[25] Geoffrey E. Hinton,et al. Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.
[26] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[27] Geoffrey E. Hinton,et al. Replicated Softmax: an Undirected Topic Model , 2009, NIPS.
[28] Ruifan Li,et al. Deep correspondence restricted Boltzmann machine for cross-modal retrieval , 2015, Neurocomputing.
[29] Ishwar K. Sethi,et al. Multimedia content processing through cross-modal association , 2003, MULTIMEDIA '03.
[30] Jason Weston,et al. Large scale image annotation: learning to rank with joint word-image embeddings , 2010, Machine Learning.
[31] David W. Jacobs,et al. Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[32] Zi Huang,et al. Inter-media hashing for large-scale retrieval from heterogeneous data sources , 2013, SIGMOD '13.
[33] Roman Rosipal,et al. Overview and Recent Advances in Partial Least Squares , 2005, SLSFS.
[34] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[35] Zan Gao,et al. Multi-view discriminative and structured dictionary learning with group sparsity for human action recognition , 2015, Signal Process.
[36] Deyu Wang,et al. Group-Pair Convolutional Neural Networks for Multi-View Based 3D Object Retrieval , 2018, AAAI.
[37] Heng Tao Shen,et al. Beyond Frame-level CNN: Saliency-Aware 3-D CNN With LSTM for Video Action Recognition , 2017, IEEE Signal Processing Letters.
[38] Jianjun Wang,et al. Kernel canonical correlation analysis via gradient descent , 2016, Neurocomputing.
[39] Yuxin Peng,et al. Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks , 2016, IJCAI.
[40] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[41] Fei Su,et al. Deep canonical correlation analysis with progressive and hypergraph learning for cross-modal retrieval , 2016, Neurocomputing.
[42] Xuelong Li,et al. Graph PCA Hashing for Similarity Search , 2017, IEEE Transactions on Multimedia.
[43] Yu Qiao,et al. A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.
[44] Roger Levy,et al. On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Joshua B. Tenenbaum,et al. Separating Style and Content with Bilinear Models , 2000, Neural Computation.
[46] Meng Wang,et al. Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder , 2018, IEEE Transactions on Image Processing.
[47] Xiaohua Zhai,et al. Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization , 2014, IEEE Transactions on Circuits and Systems for Video Technology.