Cell type discovery and representation in the era of high-content single cell phenotyping

Background: A fundamental characteristic of multicellular organisms is the specialization of functional cell types through the process of differentiation. These specialized cell types not only characterize the normal functioning of different organs and tissues, they can also be used as cellular biomarkers of a variety of different disease states and therapeutic/vaccine responses. In order to serve as a reference for cell type representation, the Cell Ontology has been developed to provide a standard nomenclature of defined cell types for comparative analysis and biomarker discovery. Historically, these cell types have been defined based on unique cellular shapes and structures, anatomic locations, and marker protein expression. However, we are now experiencing a revolution in cellular characterization resulting from the application of new high-throughput, high-content cytometry and sequencing technologies. The resulting explosion in the number of distinct cell types being identified is challenging the current paradigm for cell type definition in the Cell Ontology.

Results: In this paper, we provide examples of state-of-the-art cellular biomarker characterization using high-content cytometry and single cell RNA sequencing, and present strategies for standardized cell type representations based on the data outputs from these cutting-edge technologies, including “context annotations” in the form of standardized experiment metadata about the specimen source analyzed and marker genes that serve as the most useful features in machine learning-based cell type classification models. We also propose a statistical strategy for comparing new experiment data to these standardized cell type representations.

Conclusion: The advent of high-throughput/high-content single cell technologies is leading to an explosion in the number of distinct cell types being identified. It will be critical for the bioinformatics community to develop and adopt data standard conventions that will be compatible with these new technologies and support the data representation needs of the research community. The proposals enumerated here will serve as a useful starting point to address these challenges.
