Inferring causal relationships among intermediate phenotypes and biomarkers: a case study of rheumatoid arthritis

MOTIVATION Genetic association analysis is based on statistical correlations, which do not assign a cause-to-effect arrow between the two correlated variables. Normally, assigning cause and effect labels is unnecessary in genetic analysis, since genes are always the cause and phenotypes are always the effect. Among intermediate phenotypes and biomarkers, however, assigning cause and effect becomes meaningful, and causal inference can be useful.

RESULTS We show by example, in a study of rheumatoid arthritis, that causal inference is possible. With the help of genotypic information, namely the shared epitope, we establish the causal relationship between two disease-related biomarkers, anti-cyclic citrullinated peptide (anti-CCP) and rheumatoid factor (RF). We emphasize that the third variable must be a genotype in order to resolve potential ambiguities in causal inference. The causal inference yields two non-trivial conclusions: (1) anti-CCP is a cause of RF, and (2) it is unlikely that a third, confounding factor contributes to both anti-CCP and RF.
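The role of the genotype as the third variable can be illustrated with a small simulation. The sketch below is not the analysis of the study itself: the data are simulated, the probabilities are invented, and the names `simulate`, `odds_ratio` and `stratified_or` are hypothetical. It assumes a simple chain in which a binary genotype G influences biomarker A, and A in turn influences biomarker B. Because a genotype cannot be an effect of a biomarker, the chain G → A → B predicts that G and B are independent once A is fixed, whereas G and A remain associated when B is fixed. The reverse ordering G → B → A would predict the opposite pattern.

```python
import random


def simulate(n, seed=0):
    """Simulate n subjects under the assumed chain G -> A -> B.

    All probabilities are made up for illustration only.
    Each row is a tuple (g, a, b) of booleans.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = rng.random() < 0.5                   # genotype carrier status
        a = rng.random() < (0.7 if g else 0.3)   # biomarker A depends on G only
        b = rng.random() < (0.8 if a else 0.2)   # biomarker B depends on A only
        rows.append((g, a, b))
    return rows


def odds_ratio(pairs):
    """Odds ratio of a 2x2 table, with 0.5 added to each cell to avoid zeros."""
    counts = {(x, y): 0.5 for x in (False, True) for y in (False, True)}
    for x, y in pairs:
        counts[(x, y)] += 1
    return (counts[(True, True)] * counts[(False, False)]) / (
        counts[(True, False)] * counts[(False, True)]
    )


def stratified_or(rows, x, y, z):
    """Odds ratio between columns x and y within each level of column z."""
    return {
        level: odds_ratio([(r[x], r[y]) for r in rows if r[z] == level])
        for level in (False, True)
    }


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    # Columns: 0 = genotype G, 1 = biomarker A, 2 = biomarker B.
    print("G-B odds ratio within strata of A:", stratified_or(data, 0, 2, 1))
    print("G-A odds ratio within strata of B:", stratified_or(data, 0, 1, 2))
```

Under these assumptions, the G-B odds ratios within strata of A should be close to 1, while the G-A odds ratios within strata of B should stay well above 1. Such a pattern is compatible with A lying between G and B, and not with B lying between G and A.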
