Serendipitous learning: learning beyond the predefined label space

Most traditional supervised learning methods learn a model from labeled examples and use it to classify unlabeled examples into the same predefined label space. However, in many real-world applications, the label spaces of the labeled (training) and unlabeled (testing) examples can differ. To address this problem, this paper proposes the notion of Serendipitous Learning (SL), which covers learning scenarios in which the label space can be enlarged during the testing phase. In particular, a large margin approach is proposed to solve SL. The basic idea is to leverage the knowledge in the labeled examples to help identify novel/unknown classes; the large margin formulation incorporates both the classification loss on the examples within the known categories and the clustering loss on the examples in unknown categories. An efficient optimization algorithm based on CCCP and the bundle method is proposed to solve the resulting optimization problem. Moreover, an efficient online learning method with a guaranteed regret bound is proposed to handle large-scale data in the online learning scenario. An extensive set of experimental results on two synthetic datasets and two datasets from real-world applications demonstrates the advantages of the proposed method over several baseline algorithms. One limitation of the proposed method is that the number of unknown classes must be given in advance. It may be possible to remove this constraint by modeling the number of classes non-parametrically. We also plan to run experiments on more real-world applications in the future.
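To make the idea of combining a classification loss with a clustering loss concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a linear model with one weight vector per class (known plus unknown), a Crammer-Singer style multiclass hinge for the labeled examples, and a maximum-margin-clustering style hinge for the unlabeled examples in which each example's currently predicted class plays the role of its label. The names `multiclass_hinge`, `sl_objective`, and `lam` are hypothetical, and any constraints the paper may impose (for example class balance) are omitted.

```python
import numpy as np

def multiclass_hinge(scores, y):
    """Per-example hinge: max(0, 1 + max_{k != y} s_k - s_y)."""
    n = scores.shape[0]
    idx = np.arange(n)
    true_score = scores[idx, y]
    rivals = scores.copy()
    rivals[idx, y] = -np.inf          # exclude the target class from the max
    return np.maximum(0.0, 1.0 + rivals.max(axis=1) - true_score)

def sl_objective(W, X_lab, y_lab, X_unl, lam=0.1):
    """Regularizer + classification loss (labeled) + clustering loss (unlabeled).

    W has shape (n_known + n_unknown, d); requires at least two classes.
    This is an illustrative objective, not the one defined in the paper.
    """
    # Classification loss: labeled examples must score highest on their known class.
    cls_loss = multiclass_hinge(X_lab @ W.T, y_lab).mean()
    # Clustering loss: each unlabeled example should be assigned to some class
    # (known or unknown) with a margin over the runner-up class.
    scores_u = X_unl @ W.T
    y_hat = scores_u.argmax(axis=1)
    clu_loss = multiclass_hinge(scores_u, y_hat).mean()
    return 0.5 * lam * np.sum(W ** 2) + cls_loss + clu_loss
```

In this sketch the clustering term is non-convex in `W`, because the assigned class depends on `W` itself. That is the kind of structure a concave-convex procedure such as CCCP is typically applied to, by fixing the assignments, solving the resulting convex problem, and repeating.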
