RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data.

Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies, and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNA. The NanoString nCounter platform is well suited for profiling FFPE samples and measures gene expression with high sensitivity, which may greatly facilitate realizing the scientific and clinical value of FFPE samples. However, methodological development for normalization, a critical step in analyzing this type of data, lags far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on the oversimplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not to mention that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture the main patterns and characteristics observed in NanoString data from FFPE samples, and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, named RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption about housekeeping genes and offers great interpretability. Furthermore, it is applicable to fresh frozen or similar samples, which can generally be viewed as a reduced case of FFPE samples. Simulations and applications showed the superior performance of RCRnorm.
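For readers unfamiliar with the global scaling that the abstract criticizes, the following is a minimal sketch of housekeeping-gene scaling. It is not RCRnorm and is not taken from the paper. It assumes a common convention in which each sample is rescaled so that the geometric mean of its housekeeping counts matches the average across samples. All function names, gene names, and counts are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeping_scale_factors(counts, housekeeping):
    """Per-sample scaling factors from housekeeping genes.

    counts: dict mapping sample name -> dict mapping gene name -> raw count.
    housekeeping: list of gene names assumed (for this sketch) to be
    constantly expressed across samples.
    """
    per_sample = {
        sample: geometric_mean([genes[h] for h in housekeeping])
        for sample, genes in counts.items()
    }
    # Reference level: arithmetic mean of the per-sample geometric means.
    reference = sum(per_sample.values()) / len(per_sample)
    return {sample: reference / gm for sample, gm in per_sample.items()}


def normalize(counts, housekeeping):
    """Multiply every count in a sample by that sample's scaling factor."""
    factors = housekeeping_scale_factors(counts, housekeeping)
    return {
        sample: {gene: c * factors[sample] for gene, c in genes.items()}
        for sample, genes in counts.items()
    }


# Hypothetical toy data: sample B has exactly twice the counts of sample A.
raw_counts = {
    "A": {"HK1": 100, "HK2": 400, "GENE_X": 50},
    "B": {"HK1": 200, "HK2": 800, "GENE_X": 100},
}
normalized = normalize(raw_counts, ["HK1", "HK2"])
```

In this toy case the two samples become identical after scaling, because the only difference between them is a uniform factor. If a housekeeping gene truly varied across samples, the same procedure would push that variation into every other gene, which is the weakness the abstract points to.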
