PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data

Abstract: Single-cell RNA sequencing is an increasingly used method to measure gene expression at the single-cell level and to build cell-type atlases of tissues. Hundreds of single-cell sequencing datasets have already been published. However, studies are frequently deposited as raw data, a format that is difficult for biological researchers to use because it must first be processed with complex computational pipelines. We have implemented an online database, PanglaoDB, accessible through a user-friendly interface that can be used to explore published mouse and human single-cell RNA sequencing studies. PanglaoDB contains pre-processed and pre-computed analyses from more than 1054 single-cell experiments covering most major single-cell platforms and protocols, based on more than 4 million cells from a wide range of tissues and organs. The online interface allows users to query and explore cell types, genetic pathways and regulatory networks. In addition, we have established a community-curated cell-type marker compendium, containing more than 6000 gene-cell-type associations, as a resource for automatic annotation of cell types.

Oscar Franzén | Li-Ming Gan | Johan L. M. Björkegren
