Benchmarking a Simple Yet Effective Approach for Inferring Gene Regulatory Networks from Systems Genetic Data

We apply our recently proposed gene regulatory network (GRN) reconstruction framework for genetical genomics data to the StatSeq data. In a first step, the method uses simple genotype–phenotype and phenotype–phenotype correlation measures to construct an initial GRN. This graph contains many false-positive edges, which are reduced by (i) identifying eQTLs and retaining only one candidate edge per eQTL, and (ii) removing edges that reflect indirect effects by means of TRANSWESD, a transitive reduction approach. We discuss the general performance of our framework on the StatSeq in silico dataset by investigating the sensitivity of the two required threshold parameters and by analyzing the impact of certain network features (size, marker distance, and biological variance) on the reconstruction performance. Using selected examples, we also illustrate prominent sources of reconstruction errors. As expected, the best results are obtained with a large number of samples and larger marker distances. A less intuitive result is that significant (but not too large) biological variance can increase the reconstruction quality. Furthermore, and somewhat surprisingly, the best performance in terms of the area under the precision-recall curve (AUPR) was found for networks of medium size (1,000 nodes) rather than, as we had expected, for networks of small size (100 nodes).
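The two-stage idea described above (build a correlation-based graph, then prune indirect edges) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one marker per gene, uses only genotype–phenotype Pearson correlations, skips the eQTL step, and replaces TRANSWESD with a much simpler unweighted rule that drops an edge whenever a two-step path explains it. The function names `correlation_edges` and `prune_indirect` and the single `threshold` parameter are hypothetical.

```python
import numpy as np

def correlation_edges(genotypes, expression, threshold):
    """Candidate edges (i, j): marker of gene i correlates with expression of gene j.

    genotypes, expression: arrays of shape (n_samples, n_genes).
    Assumption: column i of both arrays belongs to the same gene i.
    """
    n_samples, n_genes = expression.shape
    # Standardize columns so that the scaled dot product is the Pearson correlation.
    g = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    e = (expression - expression.mean(axis=0)) / expression.std(axis=0)
    corr = g.T @ e / n_samples
    edges = set()
    for i in range(n_genes):
        for j in range(n_genes):
            if i != j and abs(corr[i, j]) >= threshold:
                edges.add((i, j))
    return edges

def prune_indirect(edges):
    """Drop edge (i, k) if some path i -> j -> k exists in the candidate graph.

    Simplified stand-in for transitive reduction: it ignores edge weights
    and signs, and it can over-prune when the graph contains cycles.
    """
    successors = {}
    for i, j in edges:
        successors.setdefault(i, set()).add(j)
    kept = set()
    for i, k in edges:
        explained = any(
            k in successors.get(j, ()) for j in successors[i] if j != k
        )
        if not explained:
            kept.add((i, k))
    return kept
```

On a noise-free chain 0 -> 1 -> 2, the first step would report the indirect edge (0, 2) in addition to the two true edges, and the pruning step would remove it. TRANSWESD itself additionally takes edge weights and signs into account, which this sketch does not attempt.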
