Regularization and Noise Injection for Improving Genetic Network Models

The most fundamental problem in genetic network modeling is generally known as the dimensionality problem. Typical time-course gene expression data sets contain measurements of thousands of genes taken over fewer than twenty time-steps. A large dynamic network cannot be learned from so few time-steps without additional constraints, preferably derived from biological knowledge. In this chapter, we present an approach that can find rough estimates of the underlying genetic network from limited time-course gene expression data by exploiting two observations: gene expression measurements are relatively noisy, and genetic networks are thought to be robust. The method expands the data set by adding noisy duplicates, thereby simultaneously tackling the dimensionality problem and making the solutions more robust against the (already large) noise in the data. For linear models, this concept is strongly related to shrinkage methods such as ridge regression and lasso regression, and in the limiting case it is equivalent to the Moore-Penrose pseudoinverse. The strength of noise injection lies, however, in the fact that it can be applied to any modeling approach, including non-linear models.
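
The link between noise injection and ridge regression in the linear case can be illustrated with a minimal sketch. It is not the chapter's implementation and rests on assumptions that are not stated in the abstract: zero-mean i.i.d. Gaussian noise is added to the inputs only, the targets are copied unchanged, and the data are random placeholders rather than expression measurements. Under those assumptions, the least-squares fit on many noisy duplicates should approach a ridge solution with penalty `n_samples * sigma**2`. The function names and parameters are hypothetical.

```python
import numpy as np

def ridge_solution(X, Y, lam):
    """Closed-form ridge estimate: (X^T X + lam * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

def noise_injection_solution(X, Y, sigma, n_copies, rng):
    """Ordinary least squares on a data set expanded with noisy duplicates.

    Assumption: Gaussian noise with standard deviation sigma is added to
    the inputs only; the targets are repeated without noise.
    """
    n_samples, n_features = X.shape
    noise = rng.normal(0.0, sigma, size=(n_copies, n_samples, n_features))
    # Stack the noisy copies row-wise: copy 0 first, then copy 1, and so on.
    X_aug = (X[None, :, :] + noise).reshape(-1, n_features)
    # Repeat the targets in the same order as the stacked copies.
    Y_aug = np.tile(Y, (n_copies, 1))
    W, *_ = np.linalg.lstsq(X_aug, Y_aug, rcond=None)
    return W

rng = np.random.default_rng(0)

# Placeholder data: fewer samples (time-steps) than variables (genes),
# so the plain least-squares problem is underdetermined.
n_samples, n_genes = 8, 12
X = rng.normal(size=(n_samples, n_genes))   # expression at time t
Y = rng.normal(size=(n_samples, n_genes))   # expression at time t + 1

sigma = 1.0
W_noise = noise_injection_solution(X, Y, sigma, n_copies=20000, rng=rng)
W_ridge = ridge_solution(X, Y, lam=n_samples * sigma**2)

# Relative Frobenius-norm gap between the two weight matrices.
rel_diff = np.linalg.norm(W_noise - W_ridge) / np.linalg.norm(W_ridge)
print(f"relative difference: {rel_diff:.3f}")
```

With a finite number of copies the two weight matrices agree only approximately, and the gap shrinks as `n_copies` grows. Letting `sigma` go to zero removes the penalty, which corresponds to the pseudoinverse limit mentioned above.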
