Link test - A statistical method for finding prostate cancer biomarkers

We present a new method, link-test, for selecting prostate cancer biomarkers from SELDI mass spectrometry and microarray data sets. Biomarkers selected by link-test are supported by data at both the mRNA and protein levels, which improves their robustness. Link-test determines the significance of the association between a microarray marker and a specific mass spectrum marker by constructing background mass spectra distributions estimated from all human protein sequences in the SWISS-PROT database. The data set consists of both microarray and mass spectrometry data from prostate cancer patients and healthy controls. Link-test reports a list of statistically justified prostate cancer biomarkers. Cross-validation results show high prediction accuracy using the identified biomarker panel. We also employ a text-mining approach with the OMIM database to validate the cancer biomarkers. The study with link-test represents one of the first cross-platform studies of cancer biomarkers.
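The abstract does not give the exact statistic behind link-test, so the following is only a minimal sketch of the general idea of scoring a peak against a background distribution, not the authors' implementation. It assumes a hypothetical list of background protein masses (standing in for masses derived from SWISS-PROT sequences) and a hypothetical mass tolerance, and it estimates how often a background protein would fall near an observed peak by chance. The numbers are toy values, not real data.

```python
from bisect import bisect_left, bisect_right


def chance_match_probability(background_masses, peak_mass, tolerance):
    """Fraction of background masses within +/- tolerance of peak_mass.

    Hypothetical helper: a small value suggests that a protein mass
    landing near this peak is unlikely to be a chance coincidence.
    """
    masses = sorted(background_masses)
    if not masses:
        raise ValueError("background_masses must not be empty")
    # Count masses in the closed interval [peak - tol, peak + tol].
    low = bisect_left(masses, peak_mass - tolerance)
    high = bisect_right(masses, peak_mass + tolerance)
    return (high - low) / len(masses)


# Toy background of five masses (in Da); purely illustrative.
toy_background = [1000.0, 2000.0, 2005.0, 3000.0, 4000.0]
p = chance_match_probability(toy_background, peak_mass=2002.0, tolerance=5.0)
print(p)
```

In this toy case two of the five background masses lie within 5 Da of the peak, so the estimated chance-match probability is 2/5. A real analysis would also need to account for details the abstract does not cover, such as protein processing and instrument calibration.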
