Deep Top-$k$ Ranking for Image–Sentence Matching
Lingling Zhang | Minnan Luo | Jun Liu | Yi Yang | Xiaojun Chang | Alexander G. Hauptmann
[1] Lin Ma, et al. Multimodal Convolutional Neural Networks for Matching Image and Sentence, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[2] Mathias Lux, et al. Cross Media Retrieval in Knowledge Discovery, 2004, PAKM.
[3] Fei-Fei Li, et al. Deep visual-semantic alignments for generating image descriptions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Liwei Wang, et al. Learning Two-Branch Neural Networks for Image-Text Matching Tasks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Lior Wolf, et al. Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation, 2014, ArXiv.
[6] L. Montanarella, et al. Chemometric classification of some european wines using pyrolysis mass spectrometry, 1995.
[7] Dean P. Foster, et al. Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis, 2015, ICML.
[8] Marc'Aurelio Ranzato, et al. DeViSE: A Deep Visual-Semantic Embedding Model, 2013, NIPS.
[9] Colin Fyfe, et al. Kernel and Nonlinear Canonical Correlation Analysis, 2000, IJCNN.
[10] Yueting Zhuang, et al. A low rank structural large margin method for cross-modal ranking, 2013, SIGIR.
[11] Michael K. Ng, et al. Sparse Kernel Canonical Correlation Analysis via $\ell_1$-regularization, 2017, 1701.04207.
[12] Yuxin Peng, et al. CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network, 2017, IEEE Transactions on Multimedia.
[13] Bernt Schiele, et al. Top-k Multiclass SVM, 2015, NIPS.
[14] Shian-Shyong Tseng, et al. Knee Point Search Using Cascading Top-k Sorting with Minimized Time Complexity, 2013, TheScientificWorldJournal.
[15] Qingming Huang, et al. Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation, 2014, IEEE Transactions on Multimedia.
[16] Jason Weston, et al. WSABIE: Scaling Up to Large Vocabulary Image Annotation, 2011, IJCAI.
[17] Qinghua Zheng, et al. Avoiding Optimal Mean ℓ2,1-Norm Maximization-Based Robust PCA for Reconstruction, 2017, Neural Computation.
[18] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[19] Zhihui Li, et al. Top-k multi-class SVM using multiple features, 2017, Inf. Sci.
[20] Yin Li, et al. Learning Deep Structure-Preserving Image-Text Embeddings, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Bernt Schiele, et al. Loss Functions for Top-k Error: Analysis and Insights, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Ruslan Salakhutdinov, et al. Multimodal Neural Language Models, 2014, ICML.
[23] Wei Xu, et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN), 2014, ICLR.
[24] C. V. Jawahar, et al. Im2Text and Text2Im: Associating Images and Texts for Cross-Modal Retrieval, 2014, BMVC.
[25] Trevor Darrell, et al. Long-term recurrent convolutional networks for visual recognition and description, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Yongfeng Huang, et al. Twitter100k: A Real-World Dataset for Weakly Supervised Cross-Media Retrieval, 2017, IEEE Transactions on Multimedia.
[27] Jung-Woo Ha, et al. Dual Attention Networks for Multimodal Reasoning and Matching, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] A. Murat Tekalp, et al. Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis, 2007, IEEE Transactions on Multimedia.
[29] Ruslan Salakhutdinov, et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, 2014, ArXiv.
[30] John Shawe-Taylor, et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods, 2004, Neural Computation.
[31] Frank Rudzicz, et al. Adaptive Kernel Canonical Correlation Analysis for Estimation of Task Dynamics from Acoustics, 2010, ICASSP.
[32] Michael Isard, et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics, 2012, International Journal of Computer Vision.
[33] Aviv Eisenschtat, et al. Linking Image and Text with 2-Way Nets, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Jiwen Lu, et al. Deep Coupled Metric Learning for Cross-Modal Matching, 2017, IEEE Transactions on Multimedia.
[35] Krystian Mikolajczyk, et al. Deep correlation for matching images and text, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Thomas G. Dietterich, et al. Transductive Optimization of Top k Precision, 2015, IJCAI.
[37] Qinghua Zheng, et al. Simple to Complex Cross-modal Learning to Rank, 2017, Comput. Vis. Image Underst.
[38] Yueting Zhuang, et al. Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment, 2015, ACM Multimedia.
[39] Dong Yu, et al. Deep Learning: Methods and Applications, 2014, Found. Trends Signal Process.
[40] Raman Arora, et al. Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[41] Armand Joulin, et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping, 2014, NIPS.
[42] Quoc V. Le, et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences, 2014, TACL.
[43] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[44] Yi Yang, et al. Robust Top-k Multiclass SVM for Visual Category Recognition, 2017, KDD.
[45] Qinghua Zheng, et al. Avoiding Optimal Mean Robust PCA/2DPCA with Non-greedy ℓ1-Norm Maximization, 2016, IJCAI.
[46] Martin A. Riedmiller, et al. Advanced supervised learning in multi-layer perceptrons — From backpropagation to adaptive learning algorithms, 1994.
[47] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[48] Maya R. Gupta, et al. Training highly multiclass classifiers, 2014, J. Mach. Learn. Res.
[49] Svetlana Lazebnik, et al. Improving Image-Sentence Embeddings Using Large Weakly Annotated Photo Collections, 2014, ECCV.
[50] Jeff A. Bilmes, et al. Deep Canonical Correlation Analysis, 2013, ICML.
[51] Wei Wang, et al. Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[53] Qi Tian, et al. Cross-Modal Retrieval Using Multiordered Discriminative Structured Subspace Learning, 2017, IEEE Transactions on Multimedia.
[54] Peter Young, et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics, 2013, J. Artif. Intell. Res.