Dynamic Gene Coexpression Analysis with Correlation Modeling

In many transcriptomic studies, the correlation between genes may fluctuate with quantitative factors such as genetic ancestry. We propose a method that models the covariance between two variables as a function of a continuous covariate. For the bivariate case, the proposed score test statistic is computationally simple and robust to misspecification of the covariance term. The method is then extended to test relationships between one highly connected gene, such as a transcription factor, and several other genes, allowing a more global investigation of the dynamics of the coexpression network. Simulations show that the proposed method has higher statistical power than alternatives, can be used in more diverse scenarios, and is computationally cheaper. We apply the method to African American subjects from the Genotype-Tissue Expression (GTEx) project to analyze how their gene coexpression varies with genetic ancestry and to identify transcription factors whose coexpression with their target genes changes with genetic ancestry. The proposed method can be applied to a wide array of problems that require covariance modeling.
