Regression plane concept for analysing continuous cellular processes with machine learning

Biological processes are inherently continuous, and discretising them into classes significantly restricts the chance of phenotypic discovery. Using multi-parametric active regression, we introduce a novel concept for describing and exploring biological data in a continuous manner. We have implemented Regression Plane (RP), the first user-friendly discovery tool enabling class-free phenotypic supervised machine learning.
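To make the idea of multi-parametric active regression concrete, the following is a minimal sketch and not the RP implementation. It assumes each cell is described by a feature vector and annotated with a two-dimensional position on a plane, fits a multi-output ridge regressor, and picks the next cell to annotate by bootstrap-committee disagreement. The choice of ridge regression, the committee query strategy, the synthetic data, and all function names are illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; Y may have several target columns.
    A bias column is appended and, for simplicity, regularised as well."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict(W, X):
    """Map feature vectors to continuous plane coordinates."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def committee_query(X_lab, Y_lab, X_pool, n_models=10, seed=0):
    """Return the index (into X_pool) of the sample on which a committee of
    bootstrap-trained regressors disagrees most (assumed query strategy)."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_lab), size=len(X_lab))  # bootstrap resample
        W = fit_ridge(X_lab[idx], Y_lab[idx])
        preds.append(predict(W, X_pool))
    preds = np.stack(preds)                        # (n_models, n_pool, n_targets)
    disagreement = preds.var(axis=0).sum(axis=1)   # variance summed over targets
    return int(np.argmax(disagreement))

# Synthetic stand-in for cell features and their (hypothetical) plane positions.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
Y = X @ rng.normal(size=(5, 2))

labelled = list(range(10))     # cells already placed on the plane
pool = list(range(10, 200))    # cells not yet annotated

for _ in range(5):             # five rounds of active annotation
    q = committee_query(X[labelled], Y[labelled], X[pool])
    labelled.append(pool.pop(q))   # the "annotator" reveals Y for this cell

W = fit_ridge(X[labelled], Y[labelled])
coords = predict(W, X)         # continuous 2D output for every cell, no classes
```

The output is a pair of continuous coordinates per cell rather than a class label, which is the distinction the abstract draws between regression and discretised classification.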
