Combining gene ontology with deep neural networks to enhance the clustering of single cell RNA-Seq data

BackgroundSingle cell RNA sequencing (scRNA-seq) is applied to assay the individual transcriptomes of large numbers of cells. The gene expression at single-cell level provides an opportunity for better understanding of cell function and new discoveries in biomedical areas. To ensure that the single-cell based gene expression data are interpreted appropriately, it is crucial to develop new computational methods.ResultsIn this article, we try to re-construct a neural network based on Gene Ontology (GO) for dimension reduction of scRNA-seq data. By integrating GO with both unsupervised and supervised models, two novel methods are proposed, named GOAE (Gene Ontology AutoEncoder) and GONN (Gene Ontology Neural Network) respectively.ConclusionsThe evaluation results show that the proposed models outperform some state-of-the-art dimensionality reduction approaches. Furthermore, incorporating with GO, we provide an opportunity to interpret the underlying biological mechanism behind the neural network-based model.

[1]  Stan Matwin,et al.  simDEF: definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes , 2015, Bioinform..

[2]  Jie Sun,et al.  DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function , 2018, Bioinform..

[3]  S. Teichmann,et al.  Computational and analytical challenges in single-cell transcriptomics , 2015, Nature Reviews Genetics.

[4]  J. Palermo-neto,et al.  Effects of Unilateral Cervical Vagotomy on Murine Dendritic Cells , 2015 .

[5]  Yadong Wang,et al.  Extending gene ontology with gene association networks , 2016, Bioinform..

[6]  M. Schaub,et al.  SC3 - consensus clustering of single-cell RNA-Seq data , 2016, Nature Methods.

[7]  John N. Weinstein,et al.  PathwaysWeb: a gene pathways API with directional interactions, expanded gene ontology, and versioning , 2016, Bioinform..

[8]  Chris Mungall,et al.  AmiGO: online access to ontology and annotation data , 2008, Bioinform..

[9]  Yadong Wang,et al.  A novel method to measure the semantic similarity of HPO terms , 2017, Int. J. Data Min. Bioinform..

[10]  Qinghua Guo,et al.  LncRNA2Target v2.0: a comprehensive database for target genes of lncRNAs in human and mouse , 2018, Nucleic Acids Res..

[11]  M. Cugmas,et al.  On comparing partitions , 2015 .

[12]  Aleksandra A. Kolodziejczyk,et al.  The technology and biology of single-cell RNA sequencing. , 2015, Molecular cell.

[13]  Jiajie Peng,et al.  Measuring phenotype-phenotype similarity through the interactome , 2017, BMC Bioinformatics.

[14]  Michael Q. Zhang,et al.  Network embedding-based representation learning for single cell RNA-seq data , 2017, Nucleic acids research.

[15]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[16]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[17]  Chen Xu,et al.  Identification of cell types from single-cell transcriptomes using a novel clustering method , 2015, Bioinform..

[18]  I. Amit,et al.  Massively Parallel Single-Cell RNA-Seq for Marker-Free Decomposition of Tissues into Cell Types , 2014, Science.

[19]  E. Pierson,et al.  ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis , 2015, Genome Biology.

[20]  Alex A. Pollen,et al.  Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex , 2014, Nature Biotechnology.

[21]  H. Ueda,et al.  Erratum to: Quartz-Seq: a highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity , 2017, Genome Biology.

[22]  Z. Bar-Joseph,et al.  Using neural networks for reducing the dimensions of single-cell RNA-Seq data , 2017, Nucleic acids research.

[23]  Catalin C. Barbacioru,et al.  mRNA-Seq whole-transcriptome analysis of a single cell , 2009, Nature Methods.

[24]  Christopher Yau,et al.  pcaReduce: hierarchical clustering of single cell transcriptional profiles , 2015, BMC Bioinformatics.

[25]  Carsten Peterson,et al.  Simulating the Mammalian Blastocyst - Molecular and Mechanical Interactions Pattern the Embryo , 2011, PLoS Comput. Biol..

[26]  James Bailey,et al.  Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance , 2010, J. Mach. Learn. Res..

[27]  N. Neff,et al.  Quantitative assessment of single-cell RNA-sequencing methods , 2013, Nature Methods.

[28]  Jeong Eon Lee,et al.  Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer , 2017, Nature Communications.

[29]  M. Gerstein,et al.  RNA-Seq: a revolutionary tool for transcriptomics , 2009, Nature Reviews Genetics.

[30]  Rona S. Gertner,et al.  Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells , 2013, Nature.

[31]  Yi Zhang,et al.  Average Precision , 2009, Encyclopedia of Database Systems.

[32]  Liang Cheng,et al.  Identification of Alzheimer's Disease-Related Genes Based on Data Integration Method , 2019, Front. Genet..

[33]  Roded Sharan,et al.  Using deep learning to model the hierarchical structure and function of a cell , 2018, Nature Methods.

[34]  A. Oudenaarden,et al.  Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences , 2008, Cell.

[35]  Pierre Comon,et al.  Independent component analysis, A new concept? , 1994, Signal Process..

[36]  Carolina Perez-Iratxeta,et al.  Gene function in early mouse embryonic stem cell differentiation , 2007, BMC Genomics.

[37]  Shuhui Liu,et al.  Improving the measurement of semantic similarity by combining gene ontology and co-functional network: a random walk based approach , 2018, BMC Systems Biology.

[38]  Fabian J Theis,et al.  Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells , 2015, Nature Biotechnology.

[39]  R. Sandberg,et al.  Single-Cell RNA-Seq Reveals Dynamic, Random Monoallelic Gene Expression in Mammalian Cells , 2014, Science.

[40]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..