Individual-level analysis of differential expression of genes and pathways for personalized medicine

MOTIVATION Differential expression analysis based on inter-group comparison can capture differentially expressed genes (DE genes) only at the population level, which may mask the heterogeneity of differential expression among individuals. Thus, to provide patient-specific information for personalized medicine, differential expression analysis must be conducted at the individual level.

RESULTS We proposed a method, RankComp, to detect DE genes in an individual disease sample by using the disruption, in that sample, of the gene expression ordering that is stable in normal tissue. In both simulated data and real paired cancer-normal sample data, this method showed excellent performance. It was found to be insensitive to experimental batch effects and data normalization. The landscape of stable gene pairs in a particular type of normal tissue could be predetermined using previously accumulated data, based on which dysregulated genes and pathways for any disease sample can be readily detected. The usefulness of the RankComp method in clinical settings was exemplified by the identification and application of prognostic markers for lung cancer.

AVAILABILITY AND IMPLEMENTATION RankComp is implemented as an R script that is freely available from the Supplementary Materials.
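
The rank-ordering idea summarized in the Results can be illustrated with a minimal Python sketch. This is not the authors' R implementation and it omits any statistical testing. The function names `stable_pairs` and `reversal_counts`, the `min_fraction` threshold, and the toy expression values are all assumptions made for illustration only.

```python
from itertools import combinations


def stable_pairs(normal_samples, min_fraction=0.99):
    """Return (higher, lower) gene pairs whose ordering holds in at least
    `min_fraction` of the normal samples (threshold is an assumed value)."""
    genes = sorted(normal_samples[0])
    n = len(normal_samples)
    stable = []
    for a, b in combinations(genes, 2):
        a_higher = sum(1 for s in normal_samples if s[a] > s[b])
        b_higher = sum(1 for s in normal_samples if s[a] < s[b])
        if a_higher / n >= min_fraction:
            stable.append((a, b))  # a is stably above b in normal tissue
        elif b_higher / n >= min_fraction:
            stable.append((b, a))  # b is stably above a in normal tissue
    return stable


def reversal_counts(stable, disease_sample):
    """Count, per gene, how many stable pairs are reversed in one disease
    sample. A reversal is evidence that the formerly higher gene went down
    and/or the formerly lower gene went up."""
    counts = {g: {"up": 0, "down": 0} for g in disease_sample}
    for higher, lower in stable:
        if disease_sample[higher] < disease_sample[lower]:
            counts[higher]["down"] += 1
            counts[lower]["up"] += 1
    return counts


# Toy data (hypothetical): three normal samples and one disease sample.
normals = [
    {"A": 10, "B": 5, "C": 1},
    {"A": 12, "B": 6, "C": 2},
    {"A": 9, "B": 4, "C": 3},
]
disease = {"A": 7, "B": 5, "C": 8}

pairs = stable_pairs(normals)
counts = reversal_counts(pairs, disease)
# Gene C collects the most "up" evidence in this toy disease sample.
candidate_up = max(counts, key=lambda g: counts[g]["up"])
```

Because the sketch compares expression values only within a single sample, any monotonically increasing transformation applied to that whole sample leaves the result unchanged, which is consistent with the reported insensitivity to batch effects and normalization. A full method would additionally need a significance test to decide which genes in the reversed pairs are actually dysregulated.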
