HCL: A Specification Language for Hierarchical Text Classification

Hierarchical text classification refers to assigning text documents to the categories in a given category tree based on their content. With a large number of categories organized as a tree, hierarchical text classification helps users find information more quickly and accurately. Nevertheless, hierarchical text classification methods in the past have often been constructed in a proprietary manner. The construction steps often involve human effort and are not completely automated. In this paper, we therefore propose a specification language known as HCL (Hierarchical Classification Language). HCL is designed to describe a hierarchical classification method, including the definition of a category tree and the training of classifiers associated with the categories. Using HCL, a hierarchical classification method can be materialized easily with the help of a method generator system.