Sparsely-connected autoencoder (SCA) for single cell RNAseq data mining

Single-cell RNA sequencing (scRNAseq) is an essential tool for investigating cellular heterogeneity. Although scRNAseq poses some technical challenges, it is of great interest to extract biological information from the cell subpopulations that can be defined by cluster analysis of scRNAseq data. In this manuscript, we evaluate the efficacy of the sparsely-connected autoencoder (SCA) as a tool for the functional mining of single-cell clusters. We show that SCA can be used to uncover hidden features associated with scRNAseq data. Our approach is strengthened by two metrics, QCF and QCM, which respectively evaluate the ability of SCA to reconstruct a cell cluster and the overall quality of the neural network model. Our data indicate that SCA encoded spaces, derived from different experimentally validated data (TF targets, miRNA targets, kinase targets, and cancer-related immune signatures), can be used to grasp single-cell cluster-specific functional features. In our implementation, SCA efficacy comes from its ability to reconstruct only specific clusters, thus indicating only those clusters where the SCA encoding space is a key element for cell aggregation. SCA analysis is implemented as a module in the rCASC framework and is supported by a GUI to simplify its usage for biologists and medical personnel.
