Pattern recognition of gene expression data on biochemical networks with simple wavelet transforms

Biological networks show a rather complex, scale-free topology consisting of a few highly connected nodes (hubs) and many poorly connected nodes (peripheral and concatenating nodes). Furthermore, they contain regions of rather high connectivity, e.g. metabolic pathways. Analysing data for an entire network of several thousand nodes and edges at once is not manageable. This inspired us to divide the network into functionally coherent sub-graphs and to analyse the data corresponding to each of these sub-graphs individually. We divided the network in two ways: (1) clustering approach: sub-graphs were defined as regions of higher connectivity, identified by a clustering procedure on the network; and (2) connected edge approach: paths of concatenated edges connecting striking combinations of the data were selected and taken as sub-graphs for further analysis. As experimental data we used gene expression data of the bacterium Escherichia coli exposed to two distinct environments: oxygen-rich and oxygen-deprived. We mapped the data onto the corresponding biochemical network and extracted discriminating features using Haar wavelet transforms for both strategies. In comparison to standard methods, our approaches yielded a much more consistent picture of the changed regulation in the cells. In general, our concept may be transferred to network analyses on any interaction data, provided that data for two comparable states of the associated nodes are available.
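
To make the feature-extraction step more concrete, the following is a minimal sketch of an orthonormal Haar wavelet transform applied to values ordered along a path. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how expression values are arranged before the transform, so the input here is a hypothetical list of expression differences (oxygen-rich minus oxygen-deprived) for four consecutive reactions, and the function names `haar_step` and `haar_transform` are invented for this example.

```python
import math


def haar_step(values):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approx, detail): scaled pairwise sums and scaled pairwise differences.
    """
    if len(values) % 2 != 0:
        raise ValueError("length must be even")
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_transform(values):
    """Full Haar decomposition of a sequence whose length is a power of two.

    Returns the single coarsest approximation coefficient and a list of
    detail-coefficient lists, ordered from the finest to the coarsest level.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(values)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx[0], details


# Hypothetical data (assumption): expression differences between the two
# conditions for four consecutive reactions on one path.
path_diff = [2.0, 2.0, -1.0, -1.0]
overall, details = haar_transform(path_diff)
```

In this toy input the finest-level detail coefficients are zero, because neighbouring reactions agree, while the coarsest detail coefficient is large, because the first half of the path differs from the second half. Coefficients of this kind could serve as candidate features that capture coordinated changes across adjacent nodes rather than single-gene changes.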
