Big data and computational biology strategy for personalized prognosis

The era of big data and precision medicine has led to the accumulation of massive datasets of gene expression data and clinical information from patients. For a new patient, we propose that identifying a highly similar reference patient in an existing patient database, by similarity matching of both clinical and expression data, could be useful for predicting prognostic risk or therapeutic efficacy. Here, we propose a novel methodology to predict disease or treatment outcome by analyzing the similarity between any pair of patients, each characterized by a set of pre-defined biological variables (biomarkers or clinical features) that is represented initially as a prognostic binary variable vector (PBVV) and subsequently transformed into a prognostic signature vector (PSV). Our analyses revealed that the Euclidean distance, rather than the correlation distance, was effective in defining an unbiased similarity measure between two PSVs. We applied our methods to high-grade serous ovarian cancer (HGSC) using a 36-mRNA predictor that was previously shown to stratify patients into 3 distinct prognostic subgroups. We found that a patient's age, when converted into a binary variable, was positively correlated with the overall risk of succumbing to the disease. When applied to an independent testing dataset, the inclusion of age in the molecular predictor provided a more robust personalized prognosis of overall survival, which correlated with the therapeutic response of HGSC, and provided benefit for treatment targeting of tumors in HGSC patients. Finally, our method can be generalized and applied to many other diseases to predict personalized patient outcomes accurately.
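
The matching idea described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify how a PBVV is transformed into a PSV, so the sketch compares binary vectors directly, and the patient identifiers, vector values, and function names (`euclidean_distance`, `correlation_distance`, `nearest_reference`) are hypothetical. It also shows one general property of the correlation distance, namely that it is undefined when a vector is constant; this may or may not be among the reasons the Euclidean distance performed better in the analysis.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_distance(u, v):
    """1 - Pearson correlation; returns None if either vector is constant."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    norm = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    if norm == 0:
        return None  # correlation is undefined for a constant vector
    return 1 - sum(a * b for a, b in zip(du, dv)) / norm


def nearest_reference(new_vec, references):
    """Return (patient_id, distance) of the closest reference patient."""
    return min(
        ((pid, euclidean_distance(new_vec, vec)) for pid, vec in references.items()),
        key=lambda item: item[1],
    )


# Hypothetical toy data (assumption): 1 = high-risk state, 0 = low-risk state
references = {
    "ref_A": [1, 1, 1, 1, 0],
    "ref_B": [0, 0, 1, 0, 0],
    "ref_C": [1, 0, 1, 1, 1],
}
new_patient = [1, 1, 1, 0, 0]

best_id, best_dist = nearest_reference(new_patient, references)
print(best_id, best_dist)

# A patient with every variable in the high-risk state is a constant vector:
# the Euclidean distance is still defined, the correlation distance is not.
all_high = [1, 1, 1, 1, 1]
print(euclidean_distance(all_high, references["ref_A"]))
print(correlation_distance(all_high, references["ref_A"]))
```

In this toy setting the new patient is matched to `ref_A`, which differs in a single variable. In a setting like the one described, the outcome recorded for the matched reference patient would then inform the prognosis of the new patient.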
