Big-data-based edge biomarkers: study on dynamical drug sensitivity and resistance in individuals

The big-data-based edge biomarker is a new concept for characterizing disease features from biomedical big data in a dynamical, network-based manner, and it also provides an alternative strategy for indicating disease status in single samples. This article comprehensively reviews big-data-based edge biomarkers for complex diseases in an individual patient, defined as biomarkers based on network information and high-dimensional data. Specifically, we first introduce the sources and structures of publicly accessible biomedical big data for edge biomarker and disease studies. We show that biomedical big data are typically 'small-sample size in high-dimension space', i.e. few samples but many features (e.g. omics data) for each individual, in contrast to traditional big data in many other fields, which are characterized as 'large-sample size in low-dimension space', i.e. many samples but few features. We then present the concept, model and algorithm for edge biomarkers and, further, for big-data-based edge biomarkers. Unlike conventional biomarkers, edge biomarkers, e.g. module biomarkers in module network rewiring-analysis, can predict the disease state by learning differential associations between molecules, rather than differential expressions of molecules, during disease progression or treatment in individual patients. In particular, whereas traditional biomarkers, including network and edge biomarkers, use information on molecules or edges (i.e. molecule pairs) that are common across a population, big-data-based edge biomarkers are specific to each individual and can thus accurately evaluate the disease state by taking individual heterogeneity into account. Therefore, big data must be measured in a high-dimensional space not only in the learning process but also when diagnosing or predicting the state of the tested individual.
Finally, we provide a case study that analyzes temporal expression data from a malaria vaccine trial using big-data-based edge biomarkers from module network rewiring-analysis. The illustrative results show that the identified module biomarkers can accurately distinguish vaccine recipients with or without protection and outperform previously reported gene signatures in terms of effectiveness and efficiency.
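
The contrast between differential expression and differential association can be illustrated with a small synthetic sketch. It is not the model or algorithm reviewed in this article: the data are simulated, the function names (`node_score`, `edge_score`, `single_sample_edge_feature`) are hypothetical, and the per-sample edge value used here, the product of two z-scores relative to a reference group, is an assumption chosen because its average over samples equals the Pearson correlation. Two molecules are given the same mean and variance in both groups, so a node-level comparison finds little, while the correlation between them flips sign.

```python
import numpy as np


def simulate_group(rho, n_samples, rng):
    """Draw two 'molecules' with zero mean, unit variance and correlation rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n_samples)


def node_score(control, case):
    """Differential expression: absolute difference of group means, per molecule."""
    return np.abs(case.mean(axis=0) - control.mean(axis=0))


def edge_score(control, case):
    """Differential association: absolute difference of Pearson correlations."""
    r_control = np.corrcoef(control, rowvar=False)[0, 1]
    r_case = np.corrcoef(case, rowvar=False)[0, 1]
    return abs(r_case - r_control)


def single_sample_edge_feature(sample, reference):
    """Hypothetical per-sample edge value: product of the two z-scores
    computed against a reference group (population standard deviation)."""
    z = (sample - reference.mean(axis=0)) / reference.std(axis=0)
    return z[0] * z[1]


# Simulated data only: same marginal distributions, opposite correlation.
rng = np.random.default_rng(0)
control = simulate_group(rho=0.8, n_samples=200, rng=rng)
case = simulate_group(rho=-0.8, n_samples=200, rng=rng)

print("node scores (mean differences):", node_score(control, case))
print("edge score (correlation difference):", edge_score(control, case))

# One tested individual is scored against the reference group,
# so no second cohort is needed at prediction time.
print("edge feature of first case sample:",
      single_sample_edge_feature(case[0], control))
```

In this toy setting the molecules would be missed by a mean-based comparison but stand out as an edge. The single-sample function also shows why the tested individual must be measured on the same high-dimensional features as the reference data.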
