Multi-Modal Deep Analysis for Multimedia
Xin Wang | Wenwu Zhu | Hongzhi Li
[1] Fei Wang,et al. Social Recommendation with Cross-domain Transferable Knowledge , 2022, IEEE Transactions on Knowledge and Data Engineering.
[2] Lin Wu,et al. Unsupervised Metric Fusion Over Multiview Data by Graph Random Walk-Based Cross-View Diffusion , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[3] Jonathan Masci,et al. Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Fuzhen Zhuang,et al. Supervised Representation Learning: Transfer Learning with Deep Autoencoders , 2015, IJCAI.
[5] Richard Socher,et al. Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.
[6] Jianmin Wang,et al. Semantics-preserving hashing for cross-view retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Qi Gao,et al. Analyzing Cross-System User Modeling on the Social Web , 2011, ICWE.
[8] Nikos Paragios,et al. Data fusion through cross-modality metric learning using similarity-sensitive hashing , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[9] Hal Daumé,et al. Frustratingly Easy Domain Adaptation , 2007, ACL.
[10] G. C. Tiao,et al. Bayesian inference in statistical analysis , 1973 .
[11] Meng Wang,et al. Cross-Modality Feature Learning via Convolutional Autoencoder , 2019, ACM Trans. Multim. Comput. Commun. Appl.
[12] Rajat Raina,et al. Efficient sparse coding algorithms , 2006, NIPS.
[13] Mohamed R. Amer,et al. Multimodal fusion using dynamic hybrid models , 2014, IEEE Winter Conference on Applications of Computer Vision.
[14] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res.
[15] Graham W. Taylor,et al. Deep Multimodal Learning: A Survey on Recent Advances and Trends , 2017, IEEE Signal Processing Magazine.
[16] Ke Zhang,et al. Video Summarization with Long Short-Term Memory , 2016, ECCV.
[17] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Kenji Fukumizu,et al. Equivalence of distance-based and RKHS-based statistics in hypothesis testing , 2012, ArXiv.
[19] Dacheng Tao,et al. Robust Face Recognition via Multimodal Deep Face Representation , 2015, IEEE Transactions on Multimedia.
[20] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[21] Subramanian Ramanathan,et al. No Matter Where You Are: Flexible Graph-Guided Multi-task Learning for Multi-view Head Pose Classification under Target Motion , 2013, 2013 IEEE International Conference on Computer Vision.
[22] François Laviolette,et al. Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res.
[23] Christian Wolf,et al. ModDrop: Adaptive Multi-Modal Gesture Recognition , 2014, IEEE Trans. Pattern Anal. Mach. Intell.
[24] Qiang Yang,et al. Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction , 2009, IJCAI.
[25] Changsheng Xu,et al. Cross-Domain Collaborative Learning in Social Multimedia , 2015, ACM Multimedia.
[26] Mubarak Shah,et al. Query-Focused Extractive Video Summarization , 2016, ECCV.
[27] Richard S. Zemel,et al. Exploring Models and Data for Image Question Answering , 2015, NIPS.
[28] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[29] Kate Saenko,et al. Return of Frustratingly Easy Domain Adaptation , 2015, AAAI.
[30] Zhongqi Lu,et al. Selective Transfer Learning for Cross Domain Recommendation , 2012, SDM.
[31] Zhibin Hong,et al. Tracking via Robust Multi-task Multi-view Joint Sparse Representation , 2013, 2013 IEEE International Conference on Computer Vision.
[32] Costanza Navarretta,et al. Transfer learning in multimodal corpora , 2013, 2013 IEEE 4th International Conference on Cognitive Infocommunications (CogInfoCom).
[33] Heng Ji,et al. Event Specific Multimodal Pattern Mining for Knowledge Base Construction , 2016, ACM Multimedia.
[34] Yao Hu,et al. Iterative Multi-View Hashing for Cross Media Indexing , 2014, ACM Multimedia.
[35] Qi Wu,et al. FVQA: Fact-Based Visual Question Answering , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Guiguang Ding,et al. Collective Matrix Factorization Hashing for Multimodal Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[37] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[38] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Wu-Jun Li,et al. Deep Cross-Modal Hashing , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Yu Zheng,et al. Urban Water Quality Prediction Based on Multi-Task Multi-View Learning , 2016, IJCAI.
[41] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[42] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[43] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[44] Wei Liu,et al. Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[45] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[47] Tao Mei,et al. Highlight Detection with Pairwise Deep Ranking for First-Person Video Summarization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[49] Wenwu Zhu,et al. Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks , 2017, ArXiv.
[50] Yi Zhen,et al. A probabilistic model for multimodal hash function learning , 2012, KDD.
[51] Byoung-Tak Zhang,et al. Multimodal Residual Learning for Visual QA , 2016, NIPS.
[52] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[53] Tao Mei,et al. Video Summarization by Learning Deep Side Semantic Embedding , 2019, IEEE Transactions on Circuits and Systems for Video Technology.
[54] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[55] Daniel Roggen,et al. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition , 2016, Sensors.
[56] Meng Wang,et al. Event Driven Web Video Summarization by Tag Localization and Key-Shot Identification , 2012, IEEE Transactions on Multimedia.
[57] Haohan Wang,et al. Multimodal Transfer Deep Learning with Applications in Audio-Visual Recognition , 2014 .
[58] Raghavendra Udupa,et al. Learning Hash Functions for Cross-View Similarity Search , 2011, IJCAI.
[59] Tao Mei,et al. SocialTransfer: cross-domain transfer learning from social streams for media applications , 2012, ACM Multimedia.
[60] Tony Jebara,et al. Multitask Sparsity via Maximum Entropy Discrimination , 2011, J. Mach. Learn. Res.
[61] Daoqiang Zhang,et al. Multimodal Multi-label Transfer Learning for Early Diagnosis of Alzheimer's Disease , 2015, MLMI.
[62] Victor S. Lempitsky,et al. Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.
[63] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[64] Xiaolong Jin,et al. Cross-Domain Recommendation: An Embedding and Mapping Approach , 2017, IJCAI.
[65] Shiliang Sun,et al. A survey of multi-view machine learning , 2013, Neural Computing and Applications.
[66] Seungjin Choi,et al. Sequential Spectral Learning to Hash with Multiple Representations , 2012, ECCV.
[67] Yoshua Bengio,et al. Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.
[68] Philip S. Yu,et al. Deep Visual-Semantic Hashing for Cross-Modal Retrieval , 2016, KDD.
[69] Tao Mei,et al. A Bag-of-Importance Model With Locality-Constrained Coding Based Feature Learning for Video Summarization , 2014, IEEE Transactions on Multimedia.
[70] Nitish Srivastava,et al. Learning Representations for Multimodal Data with Deep Belief Nets , 2012 .
[71] Lakhmi C. Jain,et al. Introduction to Bayesian Networks , 2008 .
[72] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[73] Massimiliano Pontil,et al. Regularized multi-task learning , 2004, KDD.
[74] Shou-De Lin,et al. A Transfer Probabilistic Collective Factorization Model to Handle Sparse Data in Collaborative Filtering , 2014, 2014 IEEE International Conference on Data Mining.
[75] Stan Z. Li,et al. Shared representation learning for heterogenous face recognition , 2014, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[76] Yi Zhen,et al. Co-Regularized Hashing for Multimodal Data , 2012, NIPS.
[77] Sarah Parisot,et al. Learning Conditioned Graph Structures for Interpretable Visual Question Answering , 2018, NeurIPS.
[78] Trevor Darrell,et al. Simultaneous Deep Transfer Across Domains and Tasks , 2015, ICCV.
[79] Zi Huang,et al. Inter-media hashing for large-scale retrieval from heterogeneous data sources , 2013, SIGMOD '13.
[80] Xiaoqing Feng,et al. Multimodal video classification with stacked contractive autoencoders , 2016, Signal Process.
[81] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[82] Yuxin Peng,et al. SCH-GAN: Semi-Supervised Cross-Modal Hashing by Generative Adversarial Network , 2018, IEEE Transactions on Cybernetics.
[83] Jiebo Luo,et al. Towards Scalable Summarization of Consumer Videos Via Sparse Dictionary Selection , 2012, IEEE Transactions on Multimedia.
[84] Tao Mei,et al. Multi-level Attention Networks for Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[85] Dongqing Zhang,et al. Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization , 2014, AAAI.
[86] Mohan S. Kankanhalli,et al. Automatic music video summarization based on audio-visual-text analysis and alignment , 2005, SIGIR '05.
[87] Michael I. Jordan,et al. Deep Transfer Learning with Joint Adaptation Networks , 2016, ICML.
[88] Yuxin Peng,et al. Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification , 2018, IJCAI.
[89] Christos Faloutsos,et al. MMSS: graph-based multi-modal story-oriented video summarization and retrieval , 2004 .
[90] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[91] Zhou Yu,et al. Discriminative coupled dictionary hashing for fast cross-media retrieval , 2014, SIGIR.
[92] Ling Shao,et al. Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[93] Chunhua Shen,et al. Explicit Knowledge-based Reasoning for Visual Question Answering , 2015, IJCAI.
[94] Trevor Darrell,et al. Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.
[95] Nicholas Jing Yuan,et al. Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds , 2016, AAAI.
[96] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[97] Arthur P. Dempster,et al. A Generalization of Bayesian Inference , 1968, Classic Works of the Dempster-Shafer Theory of Belief Functions.
[98] M. Shamim Hossain,et al. Cross-Platform Multi-Modal Topic Modeling for Personalized Inter-Platform Recommendation , 2015, IEEE Transactions on Multimedia.
[99] Zhou Yu,et al. Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[100] Heng Ji,et al. Cross-media Event Extraction and Recommendation , 2016, NAACL.
[101] Matthieu Cord,et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[102] Dhruv Batra,et al. Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[103] Lei Zhang,et al. PatternNet: Visual Pattern Mining with Deep Neural Network , 2018, ICMR.
[104] Changsheng Xu,et al. Unified YouTube Video Recommendation via Cross-network Collaboration , 2015, ICMR.
[105] Hui Chen,et al. TLRec: Transfer Learning for Cross-Domain Recommendation , 2017, 2017 IEEE International Conference on Big Knowledge (ICBK).
[106] Kristen Grauman,et al. Diverse Sequential Subset Selection for Supervised Video Summarization , 2014, NIPS.
[107] Juan Carlos Niebles,et al. Graph Distillation for Action Detection with Privileged Modalities , 2017, ECCV.
[108] Guiguang Ding,et al. Latent semantic sparse hashing for cross-modal similarity search , 2014, SIGIR.
[109] Yizhou Wang,et al. Quantized Correlation Hashing for Fast Cross-Modal Search , 2015, IJCAI.
[110] Heng Ji,et al. Improving Event Extraction via Multimodal Integration , 2017, ACM Multimedia.
[111] Victor Lavrenko,et al. Regularised Cross-Modal Hashing , 2015, SIGIR.
[112] Petros Maragos,et al. Video event detection and summarization using audio, visual and text saliency , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[113] Liang Wang,et al. Unconstrained Multimodal Multi-Label Learning , 2015, IEEE Transactions on Multimedia.
[114] David Mascharka,et al. Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[115] Heng Ji,et al. Cross-document Event Coreference Resolution based on Cross-media Features , 2015, EMNLP.
[116] Massimiliano Pontil,et al. Convex multi-task feature learning , 2008, Machine Learning.
[117] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[118] Qi Wu,et al. Visual Question Answering: A Tutorial , 2017, IEEE Signal Processing Magazine.
[119] George Trigeorgis,et al. Domain Separation Networks , 2016, NIPS.
[120] Kilian Q. Weinberger,et al. Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.
[121] Meng Wang,et al. Topic driven multimodal similarity learning with multi-view voted convolutional features , 2018, Pattern Recognit.
[122] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[123] Zhou Yu,et al. Sparse Multi-Modal Hashing , 2014, IEEE Transactions on Multimedia.
[124] Roksana Boreli,et al. Is more always merrier?: a deep dive into online social footprints , 2012, WOSN '12.
[125] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res.
[126] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[127] Yale Song,et al. TVSum: Summarizing web videos using titles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[128] Yu Zhang,et al. A Survey on Multi-Task Learning , 2017, IEEE Transactions on Knowledge and Data Engineering.
[129] Xuelong Li,et al. Deep Binary Reconstruction for Cross-Modal Hashing , 2017, IEEE Transactions on Multimedia.
[130] Changsheng Xu,et al. Mining Cross-network Association for YouTube Video Promotion , 2014, ACM Multimedia.
[131] Masahiro Suzuki,et al. Joint Multimodal Learning with Deep Generative Models , 2016, ICLR.
[132] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[133] Fei Wang,et al. Composite hashing with multiple information sources , 2011, SIGIR.
[134] Li Fei-Fei,et al. Inferring and Executing Programs for Visual Reasoning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[135] Ling Shao,et al. Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval , 2018, IEEE Transactions on Image Processing.
[136] Jun Wang,et al. Comparing apples to oranges: a scalable solution with heterogeneous hashing , 2013, KDD.
[137] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[138] Jingrui He,et al. A Graph-based Framework for Multi-Task Multi-View Learning , 2011, ICML.
[139] Mohamed R. Amer,et al. Deep Multimodal Fusion: A Hybrid Approach , 2017, International Journal of Computer Vision.
[140] Degui Xiao,et al. Medical Image Retrieval: A Multimodal Approach , 2014, Cancer informatics.
[141] Wenwu Zhu,et al. Deep Asymmetric Transfer Network for Unbalanced Domain Adaptation , 2018, AAAI.
[142] Antonio Torralba,et al. Spectral Hashing , 2008, NIPS.
[143] Zi Huang,et al. Linear cross-modal hashing for efficient multimedia search , 2013, ACM Multimedia.
[144] Jiwen Lu,et al. Cross-Modal Deep Variational Hashing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[145] Jitendra Malik,et al. Cross Modal Distillation for Supervision Transfer , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[146] Ambedkar Dukkipati,et al. Variational methods for conditional multimodal deep learning , 2016, 2017 International Joint Conference on Neural Networks (IJCNN).
[147] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[148] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[149] Yong Jae Lee,et al. Discovering important people and objects for egocentric video summarization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[150] Boqing Gong,et al. Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[151] Zi Huang,et al. Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval , 2013, IEEE Transactions on Multimedia.
[152] Razvan Pascanu,et al. Combining modality specific deep neural networks for emotion recognition in video , 2013, ICMI '13.
[153] Liang Lin,et al. Visual Question Reasoning on General Dependency Tree , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[154] Zi Huang,et al. Multiple feature hashing for real-time large scale near-duplicate video retrieval , 2011, ACM Multimedia.
[155] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[156] Wenwu Zhu,et al. Learning Compact Hash Codes for Multimodal Representations Using Orthogonal Deep Structure , 2015, IEEE Transactions on Multimedia.
[157] Daoqiang Zhang,et al. Multimodal manifold-regularized transfer learning for MCI conversion prediction , 2015, Brain Imaging and Behavior.
[158] Anton van den Hengel,et al. Graph-Structured Representations for Visual Question Answering , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[159] Chong-Wah Ngo,et al. Scalable Visual Instance Mining with Threads of Features , 2014, ACM Multimedia.
[160] Michael I. Jordan,et al. Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.
[161] Dacheng Tao,et al. A Survey on Multi-view Learning , 2013, ArXiv.
[162] Trevor Darrell,et al. Learning to Reason: End-to-End Module Networks for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[163] Xin Wang,et al. Disparity-preserved Deep Cross-platform Association for Cross-platform Video Recommendation , 2019, IJCAI.
[164] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.
[165] Jürgen Schmidhuber,et al. Multimodal Similarity-Preserving Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[166] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[167] Jianmin Wang,et al. Correlation Autoencoder Hashing for Supervised Cross-Modal Search , 2016, ICMR.