"Don't Care" Modeling: A Logical Framework for Developing Predictive System Models

Analysis of biological data often requires an understanding of the components of pathways and/or networks and their mutual dependency relationships. Such systems are often analyzed and understood from datasets made up of the states of the relevant components and a set of discrete outcomes or results. The analysis of these systems can be assisted by models that are consistent with the available data while being maximally predictive for untested conditions. Here, we present a method to construct such models for these types of systems. To maximize predictive capability, we introduce a set of "don't care" (dc) Boolean variables that must be assigned values in order to obtain a concrete model. Setting a dc variable to 1 indicates that the information from the corresponding component does not contribute to the observed result. Intuitively, setting more dc variables to 1 increases both the potential predictive capability of the model and the risk of obtaining an inconsistent model. We thus formulate our problem as maximizing the number of dc variables that are set to 1, while retaining a model solution that is consistent and can explain all the given known data. This amounts to solving a quantified Boolean formula (QBF) with three levels of quantifier alternations, with a maximization goal for the dc variables. We have developed a prototype implementation to support our new modeling approach and are applying our method to part of a classical system in developmental biology describing fate specification of vulval precursor cells in the C. elegans nematode. Our work indicates that biological instances can serve as challenging and complex benchmarks for the formal-methods research community.
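The role of the dc variables can be illustrated with a minimal sketch. It is not the QBF formulation or the prototype described above: it assumes a simplified notion of consistency (no two observations that agree on every non-dc component may have different outcomes) and replaces the QBF solver with brute-force enumeration. The function names `is_consistent` and `max_dont_care` and the toy dataset are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_consistent(data, dc):
    """Return True if the outcome is determined by the non-dc components alone.

    data: list of (component_states, outcome) pairs.
    dc:   tuple of booleans; True means the component is ignored (dc = 1).
    """
    seen = {}
    for states, outcome in data:
        # Project the observation onto the components that are not ignored.
        key = tuple(s for s, ignored in zip(states, dc) if not ignored)
        # Two observations with the same projection must share an outcome.
        if seen.setdefault(key, outcome) != outcome:
            return False
    return True


def max_dont_care(data, n_components):
    """Brute-force search for a dc assignment with the most variables set to 1.

    Returns None only if the data itself is contradictory.
    """
    for k in range(n_components, -1, -1):
        for ignored in combinations(range(n_components), k):
            dc = tuple(i in ignored for i in range(n_components))
            if is_consistent(data, dc):
                return dc
    return None


# Hypothetical toy data: the outcome depends only on the first component.
toy_data = [
    ((0, 0, 1), "A"),
    ((0, 1, 0), "A"),
    ((1, 0, 0), "B"),
    ((1, 1, 1), "B"),
]

best_dc = max_dont_care(toy_data, 3)
print(best_dc)
```

In this toy case the search keeps only the first component and marks the other two as "don't care", so the resulting model also predicts outcomes for untested states such as (0, 1, 1). Enumeration is exponential in the number of components, which is one reason a solver-based formulation is attractive for realistic instances.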
