User Oriented Semi-Supervised Document Clustering

In many text mining applications, documents must be clustered according to users' needs. However, traditional document clustering methods, which rely on unsupervised learning, cannot meet this demand. This paper proposes a new clustering approach that addresses the problem. The main contributions are: (1) expressing user requirements as topics with multiple attributes; and (2) annotating topic semantics with an ontology, calculating the dissimilarity between topic semantics, and building a dissimilarity matrix. Experiments show that the new approach is effective.
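
As an illustration of the second contribution, the following is a minimal sketch of how a dissimilarity matrix over ontology-annotated topics might be built. The abstract does not specify the dissimilarity measure, so the sketch assumes one minus the Wu-Palmer similarity on a tree-shaped ontology. The ontology `PARENT`, the concept names, and the helper functions are all hypothetical and are not taken from the paper.

```python
# Hypothetical toy ontology as a child -> parent map (the root has parent None).
PARENT = {
    "entity": None,
    "sport": "entity",
    "finance": "entity",
    "football": "sport",
    "tennis": "sport",
    "banking": "finance",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth of a concept in the tree, counting the root as depth 1."""
    return len(ancestors(concept))


def dissimilarity(a, b):
    """Assumed measure: 1 - Wu-Palmer similarity between two concepts."""
    ancestors_of_b = set(ancestors(b))
    # The first ancestor of a that is also an ancestor of b is the
    # lowest common ancestor, because the path runs from a toward the root.
    lca = next(c for c in ancestors(a) if c in ancestors_of_b)
    return 1.0 - 2.0 * depth(lca) / (depth(a) + depth(b))


def dissimilarity_matrix(topics):
    """Pairwise dissimilarity matrix as a list of rows."""
    return [[dissimilarity(a, b) for b in topics] for a in topics]


if __name__ == "__main__":
    topics = ["football", "tennis", "banking"]
    for topic, row in zip(topics, dissimilarity_matrix(topics)):
        print(topic, [round(value, 3) for value in row])
```

Under this assumed measure, topics that share a deep common ancestor (here "football" and "tennis") receive a smaller dissimilarity than topics that meet only at the root (here "football" and "banking"). The resulting matrix is symmetric with a zero diagonal, so it could be passed to any clustering method that accepts precomputed distances.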