Translating Time-Course Gene Expression Profiles into Semi-algebraic Hybrid Automata Via Dimensionality Reduction

Biotechnological innovations for sampling gene expression make it possible to measure the gene expression levels of a biological system with varying degrees of accuracy, cost, and speed. By repeating the measurement steps at different sampling rates, one can both infer relations among the genes and define a dynamic model of the underlying biological system. When a very large number of genes and measurements are involved, these data raise several difficult algorithmic questions, such as accurate model building, checking, and inference. Semi-algebraic hybrid automata have been proposed as a modeling formalism for biological systems (see, e.g., [17,6]) and have demonstrated their ability to handle complex biochemical pathways. This paper proposes an automatic procedure to build semi-algebraic hybrid automata from gene-expression profiles. To reduce the size of the resulting automata and to minimize the computational complexity of their analysis, our approach exploits various dimensionality reduction techniques. The paper concludes with several experimental results on peach fruit.

[1] Ronald W. Davis, et al. Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray, 1995, Science.

[2] S. Pennington, et al. Arrays for protein expression profiling: Towards a viable alternative to two‐dimensional gel electrophoresis?, 2001, Proteomics.

[3] W. Bialek, et al. Information-based clustering, 2005, Proceedings of the National Academy of Sciences of the United States of America.

[4] Björn Sjögreen, et al. The real-time polymerase chain reaction, 2006, Molecular aspects of medicine.

[5] Anil K. Jain, et al. Data clustering: a review, 1999, CSUR.

[6] Susmita Datta, et al. Comparisons and validation of statistical clustering techniques for microarray gene expression data, 2003, Bioinform.

[7] Heng Tao Shen, et al. Principal Component Analysis, 2009, Encyclopedia of Biometrics.

[8] H. Anai, et al. Algebraic approach to analysis of discrete-time polynomial systems, 1999, 1999 European Control Conference (ECC).

[9] B. Charrier, et al. Real-time PCR: what relevance to plant studies?, 2004, Journal of experimental botany.

[10] A. Casagrande, et al. Semi-Algebraic Constant Reset Hybrid Automata - SACoRe, 2005, Proceedings of the 44th IEEE Conference on Decision and Control.

[11] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.

[12] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[13] Thomas A. Henzinger, et al. Hybrid Automata: An Algorithmic Approach to the Specification and Verification of Hybrid Systems, 1992, Hybrid Systems.

[14] Robert L. Grossman, et al. Timed Automata, 1999, CAV.

[15] Tommi S. Jaakkola, et al. A new approach to analyzing gene expression time series data, 2002, RECOMB '02.

[16] Ziv Bar-Joseph, et al. Analyzing time series gene expression data, 2004, Bioinform.

[17] Arlindo L. Oliveira, et al. Biclustering algorithms for biological data analysis: a survey, 2004, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[18] Carla Piazza, et al. CIMS-TR 2005-859 Algorithmic Algebraic Model Checking I: The Case of Biochemical Systems and their Reachability Analysis, 2005.

[19] S. Bustin. Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays, 2000, Journal of molecular endocrinology.

[20] Carla Piazza, et al. Algorithmic Algebraic Model Checking I: Challenges from Systems Biology, 2005, CAV.

[21] S. Shankar Sastry, et al. O-Minimal Hybrid Systems, 2000, Math. Control. Signals Syst.