Problems for Structure Learning: Aggregation and Computational Complexity

Machine learning methods to find graphical models of genetic regulatory networks from cDNA microarray data have become increasingly popular in recent years. We provide three reasons to question the reliability of such methods: (1) a major theoretical challenge to any method using conditional independence relations; (2) a simulation study using realistic data that confirms the importance of the theoretical challenge; and (3) an analysis of the computational complexity of algorithms that avoid this theoretical challenge. We have no proof that one cannot possibly learn the structure of a genetic regulatory network from microarray data alone, nor do we think that such a proof is likely. However, the combination of (i) fundamental challenges from theory, (ii) practical evidence that those challenges arise in realistic data, and (iii) the difficulty of avoiding those challenges leads us to conclude that it is unlikely that current microarray technology will ever be successfully applied to this structure learning problem.
