The Landscape of Causal Inference: Perspective From Citation Network Analysis

ABSTRACT Causal inference is a fast-growing multidisciplinary field that has drawn extensive interest from the statistical sciences and from the health and social sciences. In this article, we gather comprehensive information on publications and citations in causal inference and review the field from the perspective of citation network analysis. We provide descriptive analyses showing the most cited publications, the most prolific and the most cited authors, and the structural properties of the citation network. We then examine the citation network through exponential random graph models (ERGMs). We show that citations depend both on technical aspects of the publications (e.g., publication length, time, and quality) and on social processes such as homophily (the tendency to cite publications in the same field or with shared authors), cumulative advantage, and transitivity (the tendency to cite references’ references). We also provide a specific analysis of citations among the top authors in the field and present a ranking and clustering of these authors. Overall, our article reveals new insights into the landscape of causal inference and may serve as a case study for analyzing citation networks in a multidisciplinary field and for fitting ERGMs on big networks. Supplementary materials for this article are available online.
