ESR: Optimizing Gene Feature Selection for scRNA-seq Data

The rapid development of single-cell RNA sequencing (scRNA-seq) has enabled researchers to measure gene expression in individual cells, revealing finer-grained cell types and states. However, because scRNA-seq data are characterized by low expression levels and high noise, feature selection is particularly important in single-cell analysis. Here, we introduce Entropy Stepwise Regression (ESR), a feature selection method that uses the correlation between genes and the entropy of each feature to select genes that are informative for downstream analysis. On mouse kidney samples, we compared three methods by Adjusted Rand Index (ARI) and obtained good results, which suggests that ESR can improve the accuracy of downstream analysis.
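
The abstract names two ingredients, per-gene entropy and gene-gene correlation, and one evaluation metric, the Adjusted Rand Index. The sketch below illustrates how these pieces could fit together; it is not the ESR algorithm, whose stepwise procedure is not described here. The function names, the equal-width binning, the choice to rank genes by descending entropy, and the `max_abs_corr` redundancy threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=4):
    """Shannon entropy (bits) of one gene after equal-width binning (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant gene carries no information
    width = (hi - lo) / n_bins
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_genes(expr, n_keep, max_abs_corr=0.9):
    """Hypothetical greedy filter: visit genes by descending entropy and skip
    any gene that is highly correlated with one already selected."""
    ranked = sorted(expr, key=lambda g: shannon_entropy(expr[g]), reverse=True)
    selected = []
    for gene in ranked:
        if len(selected) == n_keep:
            break
        if all(abs(pearson(expr[gene], expr[s])) < max_abs_corr for s in selected):
            selected.append(gene)
    return selected


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of at least two cells."""
    def comb2(k):
        return k * (k - 1) // 2

    n = len(labels_true)
    sum_ij = sum(comb2(c) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_a = sum(comb2(c) for c in Counter(labels_true).values())
    sum_b = sum(comb2(c) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / comb2(n)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Toy data: gene name -> expression value per cell (made up for illustration).
expr = {
    "geneA": [0, 0, 0, 0, 5, 5, 5, 5],
    "geneB": [0, 0, 0, 0, 10, 10, 10, 10],  # perfectly correlated with geneA
    "geneC": [1, 1, 1, 1, 1, 1, 1, 1],      # constant, zero entropy
    "geneD": [0, 5, 0, 5, 0, 5, 0, 5],
}
print(select_genes(expr, n_keep=2))
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

In practice the selected genes would be passed to a clustering step, and the resulting cluster labels compared against reference cell-type annotations with `adjusted_rand_index`; an ARI of 1 means the two labelings agree up to a renaming of clusters.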
