Learning Disentangled Factors from Paired Data in Cross-Modal Retrieval: An Implicit Identifiable VAE Approach
[1] Christopher Burgess,et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework , 2016, ICLR 2016.
[2] Max Welling,et al. VAE with a VampPrior , 2017, AISTATS.
[3] Nicola De Cao,et al. Hyperspherical Variational Auto-Encoders , 2018, UAI 2018.
[4] Ling Shao,et al. Hetero-Manifold Regularisation for Cross-Modal Hashing , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Chong-Wah Ngo,et al. Cross-modal Recipe Retrieval with Rich Food Attributes , 2017, ACM Multimedia.
[6] Yann LeCun,et al. Disentangling factors of variation in deep representation using adversarial training , 2016, NIPS.
[7] Pieter Abbeel,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.
[8] Ruslan Salakhutdinov,et al. Learning Factorized Multimodal Representations , 2018, ICLR.
[9] Steven C. H. Hoi,et al. Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[11] Yoshua Bengio,et al. Learning Independent Features with Adversarial Nets for Non-linear ICA , 2017, 1710.05050.
[12] Navdeep Jaitly,et al. Adversarial Autoencoders , 2015, ArXiv.
[13] Abhishek Kumar,et al. Variational Inference of Disentangled Latent Concepts from Unlabeled Observations , 2017, ICLR.
[14] Philip S. Yu,et al. Deep Visual-Semantic Hashing for Cross-Modal Retrieval , 2016, KDD.
[15] Chao Zhang,et al. Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[16] Bernhard Schölkopf,et al. Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations , 2018, ICML.
[17] Andriy Mnih,et al. Disentangling by Factorising , 2018, ICML.
[18] Michael I. Jordan,et al. A Probabilistic Interpretation of Canonical Correlation Analysis , 2005.
[19] Adi Ben-Israel,et al. The Change-of-Variables Formula Using Matrix Volume , 1999, SIAM J. Matrix Anal. Appl.
[20] Wu-Jun Li,et al. Deep Cross-Modal Hashing , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Harold Soh,et al. Hyperprior Induced Unsupervised Disentanglement of Latent Representations , 2018, AAAI.
[22] Chong-Wah Ngo,et al. Cross-Modal Recipe Retrieval: How to Cook this Dish? , 2017, MMM.
[23] Murray Shanahan,et al. SCAN: Learning Hierarchical Compositional Visual Concepts , 2017, ICLR.
[24] Vladimir Pavlovic,et al. Bayes-Factor-VAE: Hierarchical Bayesian Deep Auto-Encoder Models for Factor Disentanglement , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Mike Wu,et al. Multimodal Generative Models for Scalable Weakly-Supervised Learning , 2018, NeurIPS.
[26] Honglak Lee,et al. Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.
[27] James R. Glass,et al. Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data , 2018, ArXiv.
[28] Aapo Hyvärinen,et al. Variational Autoencoders and Nonlinear ICA: A Unifying Framework , 2019, AISTATS.
[29] Amaia Salvador,et al. Learning Cross-Modal Embeddings for Cooking Recipes and Food Images , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Kevin Murphy,et al. Generative Models of Visually Grounded Imagination , 2017, ICLR.
[31] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001.
[32] Yuxin Peng,et al. CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network , 2017, IEEE Transactions on Multimedia.
[33] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[34] Yee Whye Teh,et al. Disentangling Disentanglement in Variational Autoencoders , 2018, ICML.
[35] Vladimir Pavlovic,et al. Relevance Factor VAE: Learning and Identifying Disentangled Factors , 2019, ArXiv.
[36] Chong-Wah Ngo,et al. Deep Understanding of Cooking Procedure for Cross-modal Recipe Retrieval , 2018, ACM Multimedia.
[37] Antonio Torralba,et al. Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[39] Ling Shao,et al. Semi-supervised vision-language mapping via variational learning , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[40] Roger B. Grosse,et al. Isolating Sources of Disentanglement in Variational Autoencoders , 2018, NeurIPS.
[41] Christopher K. I. Williams,et al. A Framework for the Quantitative Evaluation of Disentangled Representations , 2018, ICLR.