Regularization Framework for Large Scale Hierarchical Classification

In this paper, we propose a regularization framework for large-scale hierarchical classification. The framework uses the structure of the hierarchy in the regularization term to share information across classes and to enforce similarity between the parameters of classes that are located near each other in the hierarchy. To address the resulting computational cost, we propose a parallel iterative optimization scheme that can handle large-scale problems with tens of thousands of classes and hundreds of thousands of instances. Experiments on multiple benchmark datasets show significant performance improvements of the proposed approach over competing approaches.
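As an illustration of what "enforcing similarity between nearby class parameters" can mean, the sketch below computes one plausible penalty: half the squared Euclidean distance between each node's parameter vector and that of its parent, summed over the hierarchy. The abstract does not state the exact form of the regularizer, so this choice, the function name `hierarchy_penalty`, and the toy hierarchy are all assumptions made for illustration only.

```python
import numpy as np

def hierarchy_penalty(weights, parent):
    """Sum of 0.5 * ||w_n - w_parent(n)||^2 over all non-root nodes.

    weights: dict mapping node name -> 1-D parameter vector (hypothetical layout)
    parent:  dict mapping node name -> parent name, or None for the root
    """
    total = 0.0
    for node, par in parent.items():
        if par is None:
            continue  # the root has no parent to be pulled toward
        diff = weights[node] - weights[par]
        total += 0.5 * float(diff @ diff)
    return total

# Toy hierarchy (assumed for illustration): root -> sports -> {tennis, golf}
parent = {"root": None, "sports": "root", "tennis": "sports", "golf": "sports"}
weights = {
    "root": np.zeros(2),
    "sports": np.array([1.0, 0.0]),
    "tennis": np.array([1.0, 1.0]),
    "golf": np.array([3.0, 0.0]),
}

penalty = hierarchy_penalty(weights, parent)
print(penalty)
```

In this form, a class whose parameters drift far from its parent's (here, `golf`) contributes most to the penalty. Each term depends only on a node and its parent, which is one reason a penalty of this kind may lend itself to the kind of parallel, iterative updates the abstract mentions.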
