Adaptive informatics for multi-factorial and high content biological data

Whereas genomic data are universally machine-readable, data from imaging, multiplex biochemistry, flow cytometry and other cell- and tissue-based assays usually reside in loosely organized files of poorly documented provenance. This situation arises because the relational databases used in genomic research are difficult to adapt to rapidly evolving experimental designs, data formats and analytic algorithms. Here we describe an adaptive approach to managing experimental data based on semantically typed data hypercubes (SDCubes), which combine the hierarchical data format 5 (HDF5) and extensible markup language (XML) file types. We demonstrate the application of SDCube-based storage using ImageRail, a software package for high-throughput microscopy. Experimental design and its day-to-day evolution, not rigid standards, determine how ImageRail data are organized in SDCubes. We applied ImageRail to collect and analyze drug dose-response landscapes in human cell lines at single-cell resolution.
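
The idea of pairing a numeric hypercube with an XML description of what each axis means can be illustrated with a minimal sketch. This is not the SDCube file layout, which the abstract does not specify: the axis names, labels, element names and the `build_descriptor` and `flat_index` helpers are all hypothetical, and a flat Python list stands in for the HDF5 dataset so that the example needs only the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical axes and labels, chosen for illustration only;
# they are not taken from the SDCube or ImageRail specifications.
AXES = [
    ("drug_dose_uM", ["0", "0.1", "1", "10"]),
    ("cell_line", ["lineA", "lineB"]),
    ("feature", ["nuclear_intensity", "cell_area"]),
]


def build_descriptor(axes):
    """Return an XML element recording the meaning of each array axis."""
    root = ET.Element("cube")
    for index, (name, labels) in enumerate(axes):
        axis = ET.SubElement(root, "axis", index=str(index), name=name)
        for label in labels:
            ET.SubElement(axis, "label").text = label
    return root


def flat_index(descriptor, **coords):
    """Translate semantic coordinates into a row-major offset."""
    offset = 0
    for axis in descriptor.findall("axis"):
        labels = [el.text for el in axis.findall("label")]
        offset = offset * len(labels) + labels.index(coords[axis.get("name")])
    return offset


descriptor = build_descriptor(AXES)

# Allocate one slot per cell of the hypercube.
size = 1
for _, labels in AXES:
    size *= len(labels)
values = [0.0] * size  # assumption: stand-in for an HDF5 dataset

# Write one measurement by naming its coordinates, not by numeric position.
position = flat_index(
    descriptor, drug_dose_uM="1", cell_line="lineB", feature="cell_area"
)
values[position] = 412.5

# The descriptor serializes to XML that can be stored next to the array.
xml_text = ET.tostring(descriptor, encoding="unicode")
```

Because the axis semantics live in the XML rather than in a fixed schema, adding a dose or a measured feature in this sketch only means editing the descriptor and reallocating the array, which is the kind of adaptability to changing experimental designs that the abstract emphasizes.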
