Genetic network reverse-engineering and network size; can we identify large GRNs?

Reverse engineering genetic regulatory networks (GRNs) is a highly challenging optimisation problem, and many questions remain unresolved about the extent to which a reverse-engineered GRN reflects the target GRN, a property we call the fidelity of the reverse-engineered GRN. Related questions concern the ability of reverse-engineering algorithms to find networks that fit the data under consideration, that is, their accuracy. Most research works with networks two orders of magnitude smaller than those of biological interest, so the following question remains unexplored: how should we expect fidelity and accuracy to vary with network size? Answering it will reveal whether results on the performance of reverse-engineering methods on small networks can be reliably extrapolated to large networks. We use real-world data to explore the accuracy and fidelity of a simple GRN reverse-engineering approach across network sizes ranging from 100 to 6,000 genes. We find that accurate networks can be found with ease at any size. However, the diversity of accurate reverse-engineered GRNs increases sharply between 100 and around 2,000 genes and then settles at a maximal level, indicating that the fidelity of reverse-engineered networks is likely to decrease sharply with size.
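
The link between diversity and fidelity can be illustrated with a small sketch. The abstract does not say how diversity is measured, so the code below assumes one plausible choice: the mean pairwise fraction of directed edges on which two equally accurate networks disagree. The function names `edge_disagreement` and `diversity`, the binary adjacency-matrix representation, and the toy networks are all hypothetical and are not taken from the paper. The intuition is that if many networks fit the data equally well yet disagree with each other, any single one of them is unlikely to match the target GRN.

```python
from itertools import combinations


def edge_disagreement(net_a, net_b):
    """Fraction of possible directed edges on which two networks differ.

    Each network is a square binary adjacency matrix (list of lists),
    where net[i][j] == 1 means gene i regulates gene j (an assumed encoding).
    """
    n = len(net_a)
    differing = sum(
        net_a[i][j] != net_b[i][j] for i in range(n) for j in range(n)
    )
    return differing / (n * n)


def diversity(networks):
    """Mean pairwise edge disagreement over a set of candidate networks."""
    pairs = list(combinations(networks, 2))
    return sum(edge_disagreement(a, b) for a, b in pairs) / len(pairs)


# Three toy 3-gene networks, assumed to fit the data equally well.
candidates = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [0, 1, 0]],
]

print(diversity(candidates))
```

A value near zero would mean the accurate networks largely agree with one another; larger values mean the data leave the network structure under-determined, which is the situation the abstract reports for larger gene counts.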
