Regularizing Generative Models Using Knowledge of Feature Dependence

Generative modeling is a fundamental problem in machine learning with many potential applications. Efficient learning of generative models requires available prior knowledge to be exploited as much as possible. In this paper, we propose a method to exploit prior knowledge of relative dependence between features for learning generative models. Such knowledge is available, for example, when side-information on features is present. We incorporate the prior knowledge by forcing marginals of the learned generative model to follow a prescribed relative feature dependence. To this end, we formulate a regularization term using a kernel-based dependence criterion. The proposed method can be incorporated straightforwardly into many optimization-based learning schemes of generative models, including variational autoencoders and generative adversarial networks. We show the effectiveness of the proposed method in experiments with multiple types of datasets and models.
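As a rough illustration of the regularization idea described above, the following sketch penalizes generated samples whose pairwise feature dependence violates a prescribed ordering. It rests on several assumptions not stated in the abstract: the kernel-based dependence criterion is taken to be the Hilbert-Schmidt independence criterion (HSIC) with a Gaussian kernel, the prior knowledge is given as hypothetical triples `(a, b, c)` meaning "feature `a` depends on feature `b` more than on feature `c`", and the penalty is a simple hinge. The function names are illustrative and this is not the authors' implementation.

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix of a sample array with n rows."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = rbf_gram(x, sigma)
    L = rbf_gram(y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def relative_dependence_penalty(samples, triples, margin=0.0, sigma=1.0):
    """Hinge penalty over triples (a, b, c) (hypothetical encoding of the
    prior knowledge): zero when HSIC(a, b) exceeds HSIC(a, c) by `margin`."""
    total = 0.0
    for a, b, c in triples:
        gap = (hsic(samples[:, a], samples[:, c], sigma)
               - hsic(samples[:, a], samples[:, b], sigma))
        total += max(0.0, margin + gap)
    return total

# Toy usage: feature 1 is a noisy copy of feature 0, feature 2 is independent.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
samples = np.stack(
    [x0, x0 + 0.1 * rng.normal(size=200), rng.normal(size=200)], axis=1
)
consistent = relative_dependence_penalty(samples, [(0, 1, 2)])
violated = relative_dependence_penalty(samples, [(0, 2, 1)])
```

In an actual training loop, such a term would presumably be computed on samples drawn from the generative model with an automatic-differentiation framework and added to the model's usual objective (for example, a VAE or GAN loss) with a weighting coefficient; the NumPy version here only shows the computation.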

Naoya Takeishi | Yoshinobu Kawahara
