Screenit: Visual Analysis of Cellular Screens

High-throughput and high-content screening enables large-scale, cost-effective experiments in which cell cultures are exposed to a wide spectrum of drugs. The resulting multivariate data sets have a large but shallow hierarchical structure. The deepest level of this structure describes cells in terms of numeric features derived from image data. The next level up describes the enclosing cell cultures in terms of the imposed experimental conditions (exposure to drugs). We present Screenit, a visual analysis approach designed in close collaboration with screening experts. Screenit enables the navigation and analysis of multivariate data at multiple hierarchy levels and at multiple levels of detail. It integrates the interactive modeling of cell physical states (phenotypes) and of the effects of drugs on cell cultures (hits). In addition, it supports quality control by detecting anomalies that indicate low-quality data, and it provides an interface designed to match the workflows of screening experts. We demonstrate analyses of a real-world data set, CellMorph, with 6 million cells across 20,000 cell cultures.
