Validation of Inference Procedures for Gene Regulatory Networks

The availability of high-throughput genomic data has motivated the development of numerous algorithms to infer gene regulatory networks. An inference procedure must be validated by its ability to produce a model network close to the ground-truth network that generated the data. The input to an inference algorithm is a data sample and its output is a network. Since input, output, and algorithm are all mathematical structures, the validity of an inference algorithm is a mathematical issue. This paper formulates validation in terms of a semi-metric distance between two networks, or a distance between two structures of the same kind deduced from the networks, such as their steady-state distributions or regulatory graphs. The paper sets up the validation framework, provides examples of distance functions, and applies them to some discrete Markov network models. It also considers approximate validation methods based on data for which the generating network is unknown, which is the situation one faces when using real data.
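
To make the idea of a distance between derived structures concrete, the following is a minimal sketch, not the paper's own definitions. It assumes a network is summarized either by a 0/1 adjacency matrix (its regulatory graph) or by the transition matrix of an ergodic Markov chain (from which a steady-state distribution is computed by power iteration). The function names, the normalized Hamming distance, and the L1 norm are illustrative choices.

```python
from typing import List

Matrix = List[List[float]]


def graph_distance(adj_a: Matrix, adj_b: Matrix) -> float:
    """Normalized Hamming distance between two 0/1 adjacency matrices
    of the same size (an assumed, illustrative graph distance)."""
    n = len(adj_a)
    mismatches = sum(
        adj_a[i][j] != adj_b[i][j] for i in range(n) for j in range(n)
    )
    return mismatches / (n * n)


def steady_state(P: Matrix, iters: int = 10_000, tol: float = 1e-12) -> List[float]:
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration. Assumes the chain is ergodic, so the limit is unique."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def steady_state_distance(P_a: Matrix, P_b: Matrix) -> float:
    """L1 distance between the steady-state distributions of two chains."""
    pa, pb = steady_state(P_a), steady_state(P_b)
    return sum(abs(x - y) for x, y in zip(pa, pb))


# Hypothetical two-state chains, used only for illustration.
P_a = [[0.9, 0.1], [0.1, 0.9]]
P_b = [[0.5, 0.5], [0.5, 0.5]]
P_c = [[0.8, 0.2], [0.4, 0.6]]

d_ab = steady_state_distance(P_a, P_b)  # different chains, same steady state
d_ac = steady_state_distance(P_a, P_c)
d_graph = graph_distance([[0, 1], [1, 0]], [[0, 1], [0, 0]])
```

In this sketch `P_a` and `P_b` are different networks with the same steady-state distribution, so their distance is zero. This is why such a function is only a semi-metric on networks: a zero distance does not imply that the networks are identical.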
