Big data and single cell transcriptomics: implications for ontological representation

Cells are the fundamental functional units of multicellular organisms, with different cell types playing distinct physiological roles in the body. The recent advent of single cell transcriptional profiling using RNA sequencing is producing “big data”, enabling the identification of novel human cell types at an unprecedented rate. In this review, we summarize recent work characterizing cell types in the human central nervous and immune systems using single cell and single nucleus RNA sequencing, and discuss the implications of these discoveries for the representation of cell types in the reference Cell Ontology (CL). We propose a method based on random forest machine learning for identifying sets of necessary and sufficient marker genes that can be used to assemble consistent and reproducible cell type definitions for incorporation into the CL. Representing defined cell type classes and their relationships in the CL using this strategy will make the cell type classes findable, accessible, interoperable, and reusable (FAIR), allowing the CL to serve as a reference knowledgebase of information about the roles that distinct cellular phenotypes play in human health and disease.
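
As an illustration of what "necessary and sufficient" could mean operationally for a candidate marker set, the following minimal sketch scores a set of genes against labeled cells. It is not the random forest procedure proposed in the review, which is not shown here. It assumes expression has already been binarized into detected and not detected, and the function name `marker_set_scores`, the gene names, and the cell type labels are all hypothetical. Recall is read as a proxy for necessity (do all cells of the type express the markers?) and precision as a proxy for sufficiency (do only cells of the type express them?).

```python
from typing import List, Sequence, Set, Tuple


def marker_set_scores(
    expressed: Sequence[Set[str]],
    labels: Sequence[str],
    markers: Sequence[str],
    target: str,
) -> Tuple[float, float]:
    """Return (precision, recall) for calling `target` by joint marker expression.

    expressed[i] is the set of genes detected in cell i (binarized, an assumption).
    A cell is "called" as the target type only if it expresses every marker.
    """
    called = [all(gene in genes for gene in markers) for genes in expressed]
    tp = sum(1 for c, lab in zip(called, labels) if c and lab == target)
    fp = sum(1 for c, lab in zip(called, labels) if c and lab != target)
    fn = sum(1 for c, lab in zip(called, labels) if not c and lab == target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # sufficiency proxy
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # necessity proxy
    return precision, recall


# Toy data: five cells, two hypothetical cell types, three hypothetical genes.
cells: List[Set[str]] = [
    {"GENE_A", "GENE_B"},
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A"},
    {"GENE_B", "GENE_C"},
    {"GENE_C"},
]
cell_labels = ["type1", "type1", "type2", "type2", "type2"]

# GENE_A alone is found in every type1 cell but also in one type2 cell.
single = marker_set_scores(cells, cell_labels, ["GENE_A"], "type1")
# GENE_A and GENE_B together are found in all type1 cells and no others.
pair = marker_set_scores(cells, cell_labels, ["GENE_A", "GENE_B"], "type1")
print("GENE_A only:", single)
print("GENE_A + GENE_B:", pair)
```

In this toy data, a single gene satisfies the necessity proxy but not the sufficiency proxy, while the two-gene combination satisfies both, which is the kind of property a definition built from marker genes would need before being proposed for an ontology class.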
