Paper information - Workflow-Based Software Environment for Large-Scale Biological Experiments

High-content screening (HCS) technologies are increasingly used in both large-scale drug discovery and basic research programs. These automated imaging and analysis technologies enable researchers to elucidate the complex biology that underlies the functions of genes, proteins, and other biomolecules at the cellular level. HCS combines automated digital microscopy with advanced software-based image analysis algorithms to detect and quantify biological changes in cells and tissues. It is particularly powerful for interrogating the cellular effects of exogenously applied agents such as RNAi reagents and small molecules. HCS can evaluate cellular perturbations both at the level of the single cell and within cellular populations. In a multivariate approach, multiple cellular parameters are collected, allowing for more complex analysis. In these scenarios, however, data flow and management still represent substantial bottlenecks in HCS projects. HCS data include information from multiple sources: details of the screening libraries (e.g., siRNA and small molecules), image stacks acquired from automated microscopes (of which there may be up to several million), and the image analysis data. From these, postprocessing algorithms are required to generate statistical, quality control, and bioinformatic information and, ultimately, a final hit list. Numerous tools are available for each individual analytical step; managing the entire information flow, however, currently requires either commercially available proprietary software, the scope of which is often limited, or bespoke scripts. In this article, the authors introduce an open-source research tool that manages the entire HCS data chain by handling and linking information and by providing many powerful postprocessing and visualization tools.
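The linking and postprocessing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool or method: the well IDs, reagent names, readout values, and the functions `robust_z_scores` and `hit_list` are all hypothetical, and the robust z-score (median and MAD) is used here only as one commonly applied way to rank wells, assumed for illustration.

```python
from statistics import median

# Hypothetical screening-library annotation: well ID -> reagent (e.g. an siRNA).
library = {"A01": "siRNA-1", "A02": "siRNA-2", "A03": "siRNA-3",
           "A04": "siRNA-4", "A05": "siRNA-5"}

# Hypothetical per-well readout from image analysis (e.g. mean cell count).
readout = {"A01": 100.0, "A02": 102.0, "A03": 98.0, "A04": 101.0, "A05": 40.0}


def robust_z_scores(values):
    """Score each well against the plate median, scaled by the MAD."""
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return {well: (v - med) / scale for well, v in values.items()}


def hit_list(library, readout, threshold=3.0):
    """Link scores back to library reagents and keep strong outliers."""
    scores = robust_z_scores(readout)
    return sorted(
        (well, library[well], round(score, 2))
        for well, score in scores.items()
        if abs(score) >= threshold
    )


# Assumed threshold of 3.0; a real screen would choose this per assay.
hits = hit_list(library, readout)
```

With the illustrative values above, only well A05 deviates strongly from the plate median, so it is the single entry in `hits`, reported together with the reagent it was linked to.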
