CellProfiler and KNIME: open source tools for high content screening.

High content screening (HCS) has established itself in the pharmaceutical industry as an essential tool for drug discovery and drug development. HCS is now entering academic research and may become a widely used technology there. Given the diversity of problems tackled in academic research, HCS could change profoundly in the future, mainly as more imaging modalities and smart microscopes are developed. Two factors limit the establishment of HCS in academia: flexibility and cost. Flexibility is needed to adapt the HCS setup to the many different assays typical of academia. Many cost factors cannot be avoided, but the cost of the software packages needed to analyze large datasets can be reduced by using open source software. We present and discuss the open source software CellProfiler for image analysis and KNIME for data analysis and data mining, which together increase flexibility and keep costs low.
