Semi-Supervised Learning with Trees

We describe a nonparametric Bayesian approach to generalizing from a few labeled examples, guided by a larger set of unlabeled objects and by the assumption that the domain has a latent tree structure. The tree (or a distribution over trees) may be inferred from the unlabeled data. A prior over concepts, generated by a mutation process on the inferred tree(s), allows efficient computation of the optimal Bayesian classification function from the labeled examples. We test our approach on eight real-world datasets.
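
To make the idea of a tree-based prior concrete, here is a minimal Python sketch, not the paper's implementation. It assumes a fixed, hand-written tree (the `TREE` dictionary and its branch lengths are hypothetical), binary labels, a uniform prior on the root label, and a symmetric two-state mutation process with an assumed `rate` parameter. Under those assumptions, the posterior label of an unlabeled leaf can be computed exactly with one upward pass over the tree per candidate label.

```python
import math

# Hypothetical toy tree: child -> (parent, branch length). "root" has no parent.
TREE = {
    "a": ("u", 0.1),
    "b": ("u", 0.1),
    "c": ("v", 0.1),
    "d": ("v", 0.1),
    "u": ("root", 1.0),
    "v": ("root", 1.0),
}


def children(node):
    """Return the children of a node in the toy tree."""
    return [child for child, (parent, _) in TREE.items() if parent == node]


def flip_prob(length, rate=1.0):
    """Probability that a binary label differs across a branch.

    Assumes a symmetric two-state mutation process (an illustrative choice).
    """
    return 0.5 * (1.0 - math.exp(-2.0 * rate * length))


def likelihood(node, observed, rate=1.0):
    """Return [P(observed leaves below node | node=0), P(... | node=1)]."""
    kids = children(node)
    if not kids:
        if node in observed:
            return [1.0 if observed[node] == s else 0.0 for s in (0, 1)]
        return [1.0, 1.0]  # unobserved leaf: contributes no evidence
    result = [1.0, 1.0]
    for kid in kids:
        q = flip_prob(TREE[kid][1], rate)
        below = likelihood(kid, observed, rate)
        for s in (0, 1):
            # Sum over the child's label: same as parent, or mutated.
            result[s] *= (1.0 - q) * below[s] + q * below[1 - s]
    return result


def joint(observed, rate=1.0):
    """Probability of the observed leaf labels under a uniform root prior."""
    return 0.5 * sum(likelihood("root", observed, rate))


def predict(query, labeled, rate=1.0):
    """Posterior probability that the query leaf has label 1."""
    p1 = joint({**labeled, query: 1}, rate)
    p0 = joint({**labeled, query: 0}, rate)
    return p1 / (p0 + p1)
```

With a single positive example at leaf `a`, `predict("b", {"a": 1})` is higher than `predict("c", {"a": 1})`, because `b` is much closer to `a` in the tree than `c` is. This is the qualitative behavior a tree-based prior is meant to capture: labels generalize more strongly to nearby leaves.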
