Kernelizing the output of tree-based methods

We extend tree-based methods to the prediction of structured outputs by kernelizing the tree-growing algorithm, so that trees can be grown whenever a kernel can be defined on the output space. The resulting algorithm, called output kernel trees (OK3), generalizes classification and regression trees as well as tree-based ensemble methods in a principled way. It inherits several features of these methods, such as interpretability, robustness to irrelevant variables, and input scalability. When only the Gram matrix over the outputs of the learning sample is given, it learns the output kernel as a function of the inputs. We show that the proposed algorithm works well on an image reconstruction task and on a biological network inference problem.
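
To illustrate how a tree can be grown from an output kernel alone, the sketch below assumes the usual variance-reduction split score of regression trees, with the output variance computed in the kernel-induced feature space from the Gram matrix. It relies on the standard identity var(S) = (1/N) Σ_i k(y_i, y_i) − (1/N²) Σ_{i,j} k(y_i, y_j). The function names (`kernel_variance`, `split_score`, `best_threshold`) are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def kernel_variance(K, idx):
    """Variance of the outputs indexed by idx, measured in the feature
    space induced by the output kernel, using only Gram matrix entries."""
    idx = np.asarray(idx, dtype=int)
    n = idx.size
    if n == 0:
        return 0.0
    sub = K[np.ix_(idx, idx)]  # Gram matrix restricted to the subset
    return np.trace(sub) / n - sub.sum() / n**2


def split_score(K, left, right):
    """Variance reduction obtained by splitting (left + right) into two
    children. Assumed criterion: the standard regression-tree score."""
    left = np.asarray(left, dtype=int)
    right = np.asarray(right, dtype=int)
    parent = np.concatenate([left, right])
    n = parent.size
    return (kernel_variance(K, parent)
            - left.size / n * kernel_variance(K, left)
            - right.size / n * kernel_variance(K, right))


def best_threshold(x, K):
    """Scan candidate thresholds on a single numeric input feature x and
    return the (threshold, score) pair with the largest variance reduction."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    best_thr, best_score = None, -np.inf
    for i in range(1, x.size):
        lo, hi = x[order[i - 1]], x[order[i]]
        if lo == hi:
            continue  # no threshold separates equal feature values
        score = split_score(K, order[:i], order[i:])
        if score > best_score:
            best_thr, best_score = (lo + hi) / 2.0, score
    return best_thr, best_score
```

With a linear kernel on scalar outputs, the score reduces to the ordinary regression-tree variance reduction, which gives a simple sanity check:

```python
y = np.array([0.0, 0.0, 1.0, 1.0])
K = np.outer(y, y)                      # linear output kernel
x = np.array([0.1, 0.2, 0.8, 0.9])     # one input feature
threshold, score = best_threshold(x, K)
```

Because only entries of `K` are used, the same code applies when the outputs themselves are unavailable and only their Gram matrix over the learning sample is given.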
