Review: Biomarker Gene Signature Discovery Integrating Network Knowledge

Discovery of prognostic and diagnostic biomarker gene signatures for diseases such as cancer is seen as a major step towards better personalized medicine. During the last decade, various methods, mainly from machine learning and statistics, have been proposed for that purpose. However, an important obstacle to making gene signatures a standard tool in clinical diagnosis is their typically low reproducibility, combined with the difficulty of achieving a clear biological interpretation. To address these problems, there has been growing interest in recent years in approaches that integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far.
