Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform

High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows external scientific applications to be integrated quickly. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The resulting platform is web-accessible, facilitates access to and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed with the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations of Jenkins-CI (primarily in the user interface) were addressed by selecting helper plugins from the Jenkins-CI community.
