Unpaired Multi-Domain Causal Representation Learning

The goal of causal representation learning is to find a representation of data that consists of causally related latent variables. We consider a setup where one has access to data from multiple domains that potentially share a causal representation. Crucially, observations in different domains are assumed to be unpaired, that is, we only observe the marginal distribution in each domain but not their joint distribution. In this paper, we give sufficient conditions for identifiability of the joint distribution and the shared causal graph in a linear setup. Identifiability holds if we can uniquely recover the joint distribution and the shared causal representation from the marginal distributions in each domain. We transform our identifiability results into a practical method to recover the shared latent causal graph.
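To make the unpaired setting concrete, the following is a minimal simulation sketch, not the paper's method. It assumes a two-variable latent linear structural equation model with non-Gaussian (Laplace) noise and one random linear mixing matrix per domain. The names `sample_domain`, `B`, `G1`, and `G2` are hypothetical and chosen only for illustration. Each domain is sampled independently, so only the marginal distribution of each domain is available and no row of one domain corresponds to a row of the other.

```python
import numpy as np

def sample_domain(rng, n, B, G):
    """Draw n observations X = G Z, where Z solves Z = B Z + eps.

    B is a (hypothetical) latent edge-weight matrix, G a domain-specific
    linear mixing matrix. Laplace noise is an illustrative assumption.
    """
    k = B.shape[0]
    eps = rng.laplace(size=(n, k))               # independent non-Gaussian noise
    Z = eps @ np.linalg.inv(np.eye(k) - B).T     # rows are z = (I - B)^{-1} eps
    return Z @ G.T                               # rows are x = G z

rng = np.random.default_rng(0)

# Shared latent causal graph Z1 -> Z2 with edge weight 0.8 (assumed).
B = np.array([[0.0, 0.0],
              [0.8, 0.0]])

# Each domain observes the shared latents through its own mixing matrix.
G1 = rng.normal(size=(4, 2))   # domain 1: 4 observed variables
G2 = rng.normal(size=(3, 2))   # domain 2: 3 observed variables

# Unpaired data: separate, independent draws with different sample sizes.
X1 = sample_domain(rng, 500, B, G1)
X2 = sample_domain(rng, 700, B, G2)
```

The identifiability question posed above is whether the shared graph encoded in `B`, together with the joint distribution across domains, can be recovered from `X1` and `X2` alone, even though the two samples share no common rows.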
