Channeling the Data Flood: Handling Large-Scale Biomolecular Measurements in Silico

The cells of the human body each contain thousands of molecular species, the dynamic interactions of which constitute the biomolecular system underlying cellular function. Many thousands of expressed genes, proteins, and metabolites can be measured simultaneously in a tissue or blood sample, making a scan of most biomolecules possible. This wealth of information shifts the bottleneck in biomedical research from making measurements to data integration and analysis. Management of biomolecular information requires efficient data storage and retrieval from integrated databases. Subsequent data analysis has three components: 1) identification of broad molecular "fingerprints" useful for early disease diagnosis and treatment selection; 2) identification of the relatively few molecular signals that change significantly above the high noise level formed by biological variation among individuals; and 3) development of a mechanistic understanding of the system by capturing its characteristics in computational models. At present, extensive parts of the biomolecular system are still uncharted. Representing the entire system in silico is a great challenge, made harder still by the large interindividual variation at the molecular level. The prospect of much better control of human disease makes this gigantic enterprise worthwhile.
