Domain Adaptation As a Problem of Inference on Graphical Models

This paper is concerned with data-driven unsupervised domain adaptation, where it is unknown in advance how the joint distribution changes across domains, i.e., which factors or modules of the data distribution remain invariant and which change across domains. To automate domain adaptation with multiple source domains, we propose to use a graphical model as a compact way to encode how the joint distribution changes, which can be learned from data, and then view domain adaptation as a problem of Bayesian inference on the graphical model. Such a graphical model distinguishes between the constant and changing modules of the distribution and specifies the properties of the changes across domains; this serves as prior knowledge of the changing modules for deriving the posterior of the target variable $Y$ in the target domain. This provides an end-to-end framework for domain adaptation, in which additional knowledge about how the joint distribution changes, if available, can be directly incorporated to improve the graphical representation. We discuss how causality-based domain adaptation can be put under this umbrella. Experimental results on both synthetic and real data demonstrate the efficacy of the proposed framework for domain adaptation. The code is available at this https URL.
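
To make the idea of "domain adaptation as Bayesian inference" concrete, the following is a minimal toy sketch, not the paper's actual model or code. It assumes a deliberately simple graph $Y \to X$ in which the module $P(X \mid Y)$ is constant across domains (two unit-variance Gaussians with hypothetical means) and only the class prior $\pi = P(Y=1)$ changes. The class priors observed in the source domains are turned into a prior over the target $\pi$ (here by a kernel density estimate, an illustrative choice), and the posterior of $Y$ in the target domain is obtained by marginalizing over $\pi$ given the unlabeled target data. All names and parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def target_posterior(x_target, source_priors, means=(-1.0, 1.0), std=1.0,
                     grid_size=99, bandwidth=0.1):
    """Toy Bayesian inference for the target domain (illustrative only).

    Constant module: P(X | Y), Gaussian with the given (hypothetical) means.
    Changing module: pi = P(Y = 1), whose prior is estimated from the
    class priors seen in the source domains.
    Returns the posterior over pi on a grid and P(Y = 1 | x, target data).
    """
    pi_grid = np.linspace(0.01, 0.99, grid_size)

    # Prior over the changing parameter: kernel density estimate built
    # from the source-domain class priors (an assumed, simple choice).
    src = np.asarray(source_priors, dtype=float)
    prior = gaussian_pdf(pi_grid[:, None], src[None, :], bandwidth).mean(axis=1)

    # Likelihoods under the invariant module P(X | Y).
    lik0 = gaussian_pdf(x_target, means[0], std)
    lik1 = gaussian_pdf(x_target, means[1], std)

    # Marginal likelihood of each target point for every candidate pi.
    mix = pi_grid[:, None] * lik1[None, :] + (1 - pi_grid[:, None]) * lik0[None, :]

    # Posterior over pi given all unlabeled target data (computed in log space).
    log_post = np.log(prior) + np.log(mix).sum(axis=1)
    log_post -= log_post.max()
    post_pi = np.exp(log_post)
    post_pi /= post_pi.sum()

    # Posterior of Y for each target point, marginalizing over pi.
    p_y1_given_pi = pi_grid[:, None] * lik1[None, :] / mix
    p_y1 = (post_pi[:, None] * p_y1_given_pi).sum(axis=0)
    return pi_grid, post_pi, p_y1

# Synthetic target domain with a class prior different from most sources.
rng = np.random.default_rng(0)
true_pi = 0.8
y_target = (rng.random(2000) < true_pi).astype(int)   # hidden labels
x_target = rng.normal(np.where(y_target == 1, 1.0, -1.0), 1.0)

pi_grid, post_pi, p_y1 = target_posterior(x_target, source_priors=[0.3, 0.5, 0.7])
pi_estimate = float((pi_grid * post_pi).sum())         # posterior mean of pi
accuracy = float(((p_y1 > 0.5).astype(int) == y_target).mean())
print(f"posterior mean of pi: {pi_estimate:.3f}, target accuracy: {accuracy:.3f}")
```

In this sketch the source domains contribute only through the prior over $\pi$, while the unlabeled target data update that prior. The framework described above is more general: the graph itself and the set of changing modules are learned from data rather than fixed by hand.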


[38]  Jiawei Han,et al.  Knowledge transfer via multiple model local structure mapping , 2008, KDD.

[39]  Tom Burr,et al.  Causation, Prediction, and Search , 2003, Technometrics.

[40]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .

[41]  George Trigeorgis,et al.  Domain Separation Networks , 2016, NIPS.

[42]  H. Shimodaira,et al.  Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .

[43]  Nicolas Courty,et al.  Match and Reweight Strategy for Generalized Target Shift , 2020, ArXiv.

[44]  Michèle Sebag,et al.  Learning Functional Causal Models with Generative Neural Networks , 2018 .

[45]  Kun Zhang,et al.  Twin Auxilary Classifiers GAN , 2019, NeurIPS.

[46]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[47]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[48]  Ivor W. Tsang,et al.  Domain adaptation from multiple sources via auxiliary classifiers , 2009, ICML '09.

[49]  Alexandros G. Dimakis,et al.  CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training , 2017, ICLR.

[50]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[51]  Marco Loog,et al.  Semi-Generative Modelling: Covariate-Shift Adaptation with Cause and Effect Features , 2018, AISTATS.

[52]  Bianca Zadrozny,et al.  Learning and evaluating classifiers under sample selection bias , 2004, ICML.

[53]  Gilles Blanchard,et al.  Generalizing from Several Related Classification Tasks to a New Unlabeled Sample , 2011, NIPS.

[54]  Brian C. Lovell,et al.  Unsupervised Domain Adaptation by Domain Invariant Projection , 2013, 2013 IEEE International Conference on Computer Vision.

[55]  Zaïd Harchaoui,et al.  A Fast, Consistent Kernel Two-Sample Test , 2009, NIPS.

[56]  Joris M. Mooij,et al.  Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions , 2017, NeurIPS.

[57]  Sethuraman Panchanathan,et al.  Multi-source domain adaptation and its application to early detection of fatigue , 2011, KDD.

[58]  Masashi Sugiyama,et al.  Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching , 2012, ICML.

[59]  Jin Tian,et al.  Recovering from Selection Bias in Causal and Statistical Inference , 2014, AAAI.

[60]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[61]  Richard S. Zemel,et al.  Generative Moment Matching Networks , 2015, ICML.

[62]  Judea Pearl,et al.  A Probabilistic Calculus of Actions , 1994, UAI.

[63]  Suchi Saria,et al.  I-SPEC: An End-to-End Framework for Learning Transportable, Shift-Stable Models , 2020, ArXiv.

[64]  Bernhard Schölkopf,et al.  Causal Discovery from Heterogeneous/Nonstationary Data , 2019, J. Mach. Learn. Res..

[65]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[66]  José M. F. Moura,et al.  Adversarial Multiple Source Domain Adaptation , 2018, NeurIPS.

[67]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[68]  Bernhard Schölkopf,et al.  Kernel-based Conditional Independence Test and Application in Causal Discovery , 2011, UAI.

[69]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.

[70]  Alexander J. Smola,et al.  Detecting and Correcting for Label Shift with Black Box Predictors , 2018, ICML.

[71]  Gang Niu,et al.  Rethinking Importance Weighting for Deep Learning under Distribution Shift , 2020, NeurIPS.

[72]  Philip S. Yu,et al.  Transfer Feature Learning with Joint Distribution Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[73]  P. J. Green,et al.  Density Estimation for Statistics and Data Analysis , 1987 .

[74]  Yishay Mansour,et al.  Learning Bounds for Importance Weighting , 2010, NIPS.

[75]  Bernhard Schölkopf,et al.  Causal Discovery from Nonstationary/Heterogeneous Data: Skeleton Estimation and Orientation Determination , 2017, IJCAI.

[76]  Bernhard Schölkopf,et al.  Domain Generalization via Invariant Feature Representation , 2013, ICML.

[77]  Nicolas Courty,et al.  Joint distribution optimal transportation for domain adaptation , 2017, NIPS.