Expression analysis of RNA sequencing data from human neural and glial cell lines depends on technical replication and normalization methods

Background: The potential for astrocyte participation in central nervous system recovery is highlighted by in vitro experiments demonstrating their capacity to transdifferentiate into neurons. Understanding astrocyte plasticity could be advanced by comparing astrocytes with stem cells. RNA sequencing (RNA-Seq) is ideal for comparing differences across cell types. However, this novel multi-stage process has the potential to introduce unwanted technical variation at several points in the experimental workflow. Quantitative understanding of the contribution of experimental parameters to technical variation would facilitate the design of robust RNA-Seq experiments.

Results: RNA-Seq was used to achieve biological and technical objectives. The biological aspect compared gene expression between normal human fetal-derived astrocytes and human neural stem cells cultured in identical conditions. When differential expression threshold criteria of |log2 fold change| > 2 were applied to the data, no significant differences were observed. The technical component quantified variation arising from particular steps in the research pathway, and compared the ability of different normalization methods to reduce unwanted variance. To facilitate this objective, a liberal false discovery rate of 10% and a |log2 fold change| > 0.5 were implemented for the differential expression threshold. Data were normalized with RPKM, TMM, and UQS methods using JMP Genomics. The contributions of key replicable experimental parameters (cell lot; library preparation; flow cell) to variance in the data were evaluated using principal variance component analysis. Our analysis showed that, although the variance for every parameter is strongly influenced by the normalization method, the largest contributor to technical variance was library preparation. The ability to detect differentially expressed genes was also affected by normalization; differences were only detected in non-normalized and TMM-normalized data.

Conclusions: The similarity in gene expression between astrocytes and neural stem cells supports the potential for astrocytic transdifferentiation into neurons, and emphasizes the need to evaluate the therapeutic potential of astrocytes for central nervous system damage. The choice of normalization method influences the contributions to experimental variance as well as the outcomes of differential expression analysis. However, irrespective of normalization method, our findings illustrate that library preparation contributed the largest component of technical variance.
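
As an illustration of two of the normalization schemes named above and of the liberal differential expression threshold (false discovery rate of 10%, |log2 fold change| > 0.5), the following is a minimal Python sketch. It is not the JMP Genomics implementation used in the study. The function names, the genes-by-samples matrix layout, the strict inequalities, and the use of the Benjamini-Hochberg adjustment as the FDR procedure are assumptions made for illustration, and TMM is omitted because it requires a trimmed-mean step not shown here.

```python
import numpy as np

# Hypothetical helpers; counts are a genes x samples matrix of raw read counts.

def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    library_millions = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / library_millions

def upper_quartile_scale(counts):
    """Divide each sample by its 75th-percentile count, rescaled to mean 1."""
    counts = np.asarray(counts, dtype=float)
    expressed = counts.sum(axis=1) > 0  # ignore genes with no reads anywhere
    upper_quartiles = np.percentile(counts[expressed], 75, axis=0)
    factors = upper_quartiles / upper_quartiles.mean()
    return counts / factors

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

def call_differential(log2_fc, pvalues, fdr=0.10, min_abs_log2_fc=0.5):
    """Boolean mask of genes passing both the FDR and fold-change cutoffs."""
    adjusted = benjamini_hochberg(pvalues)
    return (adjusted < fdr) & (np.abs(np.asarray(log2_fc)) > min_abs_log2_fc)

if __name__ == "__main__":
    # Toy data: sample 2 was sequenced at exactly twice the depth of sample 1.
    counts = [[10, 20], [30, 60], [60, 120]]
    print(rpkm(counts, [1000, 2000, 1000]))
    print(upper_quartile_scale(counts))
    print(call_differential([1.0, 0.2, -0.8, 3.0], [0.01, 0.04, 0.03, 0.5]))
```

In the toy data the two samples differ only in sequencing depth, so both schemes return identical columns after normalization. Passing `min_abs_log2_fc=2` corresponds to the stricter fold-change criterion used for the biological comparison.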
