A New Graph Autoencoder-Based Consensus-Guided Model for scRNA-seq Cell Type Detection

Single-cell RNA sequencing (scRNA-seq) provides a fine-grained view of cellular heterogeneity and has advanced genomics by enabling cell types to be distinguished precisely. However, properties of single-cell datasets such as frequent dropout events, noise, and high dimensionality remain a challenge for the field. To use single-cell data more efficiently and to better explore heterogeneity among cells, this article proposes scGAC, a new graph autoencoder (GAE)-based consensus-guided model. The data are first preprocessed into multiple top-level feature datasets. Feature learning is then performed with GAEs to generate new feature matrices, followed by similarity learning based on distance fusion methods. The learned similarity matrices are fed back to the GAEs to guide their feature learning. These steps are iterated to integrate a final consistent similarity matrix, which is then used for downstream analyses. The scGAC model can accurately identify critical features and effectively preserve the internal structure of the data, which further improves the accuracy of cell type identification.
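
The iterative loop described in the abstract can be illustrated with a toy sketch. The abstract does not specify the network architecture, the preprocessing rule, or the fusion method, so everything below is an assumption made for illustration: views are built by keeping the highest-variance genes, an untrained graph-propagation step stands in for the GAE encoder, and a plain average of Gaussian-kernel similarities stands in for distance fusion. All function names are hypothetical and are not taken from scGAC.

```python
import numpy as np

def top_variance_views(X, sizes):
    """Build one view per size by keeping the highest-variance genes (columns)."""
    order = np.argsort(X.var(axis=0))[::-1]
    return [X[:, order[:k]] for k in sizes]

def normalize_adj(S):
    """Symmetric normalization D^-1/2 (S + I) D^-1/2, as in GCN-style layers."""
    A = S + np.eye(S.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gaussian_similarity(Z):
    """Cell-by-cell similarity from pairwise squared Euclidean distances."""
    sq = (Z ** 2).sum(axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    sigma = np.median(D[D > 0])  # heuristic bandwidth (an assumption)
    return np.exp(-D / sigma)

def consensus_similarity(X, sizes=(50, 100), n_iter=3):
    """Alternate graph-guided embedding and similarity fusion for n_iter rounds."""
    views = top_variance_views(X, sizes)
    S = np.eye(X.shape[0])  # start with no cell-cell edges
    for _ in range(n_iter):
        A = normalize_adj(S)
        # Untrained stand-in for a GAE encoder: propagate features over the graph.
        embeddings = [A @ V for V in views]
        # Stand-in for distance fusion: average the per-view similarities.
        S = np.mean([gaussian_similarity(Z) for Z in embeddings], axis=0)
    return S
```

Here `X` is assumed to be a cells-by-genes expression matrix. The returned matrix could be passed to a clustering method such as spectral clustering; a real GAE would replace the propagation step with trained encoder and decoder weights.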
