Paper Information - Maximum-Margin Framework for Training Data Synchronization in Large-Scale Hierarchical Classification

In the context of supervised learning, the training data for large-scale hierarchical classification consist of (i) a set of input-output pairs, and (ii) a hierarchy structure defining parent-child relations among class labels. It is often the case that the hierarchy structure given a priori is not optimal for achieving high classification accuracy. This is especially true for web taxonomies such as the Yahoo! Directory, which consist of tens of thousands of classes. Furthermore, an important goal of hierarchy design is to provide better navigability and browsing. In this work, we propose a maximum-margin framework for automatically adapting the given hierarchy by using the set of input-output pairs to yield a new hierarchy. The proposed method is not only theoretically justified but also provides a more principled alternative to the hierarchy flattening techniques proposed earlier, which are ad hoc and empirical in nature. The empirical results on publicly available large-scale datasets demonstrate that classification with the new hierarchy leads to generalization performance that is better than or comparable to that of the hierarchy flattening techniques.
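To make the hierarchy flattening baseline mentioned in the abstract concrete, the following is a minimal sketch of one generic flattening step: removing an internal node and reattaching its children to its parent. It is not the maximum-margin method proposed in the paper, whose details are not given here. The child-to-parent dictionary, the toy class labels, and the function name `flatten_node` are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional

# Hypothetical representation: each class label maps to its parent (None = root).
Hierarchy = Dict[str, Optional[str]]


def flatten_node(parent_of: Hierarchy, node: str) -> Hierarchy:
    """Return a new hierarchy with `node` removed and its children
    reattached to the parent of `node`. The input is left unchanged."""
    if node not in parent_of:
        raise KeyError(f"unknown node: {node}")
    grandparent = parent_of[node]
    if grandparent is None:
        raise ValueError("cannot remove the root")
    flattened: Hierarchy = {}
    for child, parent in parent_of.items():
        if child == node:
            continue  # drop the removed node itself
        # Children of the removed node move up one level.
        flattened[child] = grandparent if parent == node else parent
    return flattened


# Toy taxonomy (illustrative labels, not taken from any dataset).
toy: Hierarchy = {
    "root": None,
    "science": "root",
    "physics": "science",
    "biology": "science",
    "sports": "root",
}
flat = flatten_node(toy, "science")
```

After this call, `physics` and `biology` sit directly under `root`, so a top-down classifier no longer has to make the intermediate `science` decision. Which nodes to remove is exactly the choice that flattening heuristics make empirically and that the abstract proposes to make in a more principled way.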
