LiNGAM: Non-Gaussian Methods for Estimating Causal Structures

In many empirical sciences, researchers need to study the causal mechanisms underlying the phenomena they observe. Structural equation modeling is a general framework for multivariate analysis and provides a powerful method for studying causal mechanisms. In many cases, however, classical structural equation modeling cannot estimate the causal directions between variables, because it explicitly or implicitly assumes that the data are Gaussian and typically uses only their covariance structure. In many applications the data are non-Gaussian, which means that the data distribution may contain more information than the covariance matrix can capture. Many new methods have therefore recently been proposed that exploit the non-Gaussian structure of data to estimate the causal directions of variables. In this paper, we provide an overview of these recent developments in causal inference, focusing in particular on the non-Gaussian methods known as LiNGAM.
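The point that non-Gaussian data carry directional information beyond the covariance matrix can be illustrated with a small simulation. The sketch below is not the LiNGAM estimation procedure itself. It is a toy check under stated assumptions: two variables, a linear model y = 0.8 x + e, and a simple dependence score (the absolute correlation between the squared regressor and the squared least-squares residual). The function names, the coefficient 0.8, and the choice of score are all illustrative and do not come from the paper.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Regress effect on cause by least squares and return
    |corr(cause^2, residual^2)| as a crude dependence score.
    (Illustrative score only; an assumption of this sketch.)"""
    cause = cause - cause.mean()
    effect = effect - effect.mean()
    slope = np.dot(cause, effect) / np.dot(cause, cause)
    residual = effect - slope * cause
    return abs(np.corrcoef(cause**2, residual**2)[0, 1])

def simulate(noise, n=20000, seed=0):
    """Draw x and y = 0.8 * x + e with x and e independent."""
    rng = np.random.default_rng(seed)
    x = noise(rng, n)
    y = 0.8 * x + noise(rng, n)
    return x, y

def uniform(rng, n):
    return rng.uniform(-1.0, 1.0, n)

def gaussian(rng, n):
    return rng.normal(0.0, 1.0, n)

# Non-Gaussian case: only the true direction x -> y leaves a residual
# that looks independent of the regressor.
x, y = simulate(uniform)
forward_ng = residual_dependence(x, y)   # regress y on x (true direction)
backward_ng = residual_dependence(y, x)  # regress x on y (wrong direction)

# Gaussian case: both directions look equally acceptable.
x, y = simulate(gaussian)
forward_g = residual_dependence(x, y)
backward_g = residual_dependence(y, x)

print(f"uniform : forward={forward_ng:.3f} backward={backward_ng:.3f}")
print(f"gaussian: forward={forward_g:.3f} backward={backward_g:.3f}")
```

In both directions the residual is uncorrelated with the regressor by construction, so second-order statistics cannot separate them. With uniform noise the score stays near zero only in the true direction, while with Gaussian noise it stays near zero in both, which is why the direction is not identifiable in that case.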
