PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data

We propose a "soft greedy" learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets.

[1]  J. Mesirov,et al.  Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.

[2]  Jason Weston,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[3]  David Haussler,et al.  Quantifying Inductive Bias: AI Learning Algorithms and Valiant's Learning Framework , 1988, Artif. Intell..

[4]  Nello Cristianini,et al.  Support vector machine classification and validation of cancer tissue samples using microarray expression data , 2000, Bioinform..

[5]  Matthias W. Seeger,et al.  PAC-Bayesian Generalisation Error Bounds for Gaussian Process Classification , 2003, J. Mach. Learn. Res..

[6]  T. Poggio,et al.  Prediction of central nervous system embryonal tumour outcome based on gene expression , 2002, Nature.

[7]  David A. McAllester Some PAC-Bayesian Theorems , 1998, COLT' 98.

[8]  John Shawe-Taylor,et al.  The Set Covering Machine , 2003, J. Mach. Learn. Res..

[9]  David A. McAllester PAC-Bayesian Stochastic Model Selection , 2003, Machine Learning.

[10]  J. Langford Tutorial on Practical Prediction Theory for Classification , 2005, J. Mach. Learn. Res..

[11]  U. Alon,et al.  Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[12]  Geoffrey J McLachlan,et al.  Selection bias in gene extraction on the basis of microarray gene-expression data , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[13]  N. Littlestone Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).