Data mining methods in Omics-based biomarker discovery.

The advent of Omics technologies such as genomics and proteomics has raised the hope of discovering novel biomarkers that can be used to diagnose, predict, and monitor the progression of disease. The importance of data mining for identifying biological markers for diagnostic classification and prognostic assessment from microarray and proteomic data is increasingly recognized. We present an overview of general data mining methods and their applications to biomarker discovery, with particular focus on genomic and proteomic data. Two case studies are presented as examples, and relevant data mining terminology and techniques are explained.
