Web documents clustering with interest links

Clustering Web documents is an effective Web mining technique. This paper proposes a novel algorithm for clustering Web documents from the perspective of Web usage by analyzing the WWW cache, whose documents reflect users' recent interests. To exploit the rich semantic information embedded in hyperlinks, we first extract the hyperlinks from the Web documents in the WWW cache and then model those documents as an undirected Web graph. A clustering algorithm based on this Web graph model is then given. Finally, experimental results verify that the algorithm is efficient and feasible.
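
The pipeline outlined in the abstract (extract hyperlinks, build an undirected Web graph over the cached documents, cluster on that graph) can be illustrated with a minimal sketch. The abstract does not specify the clustering criterion, so this sketch assumes the simplest possible one, connected components of the graph, as a stand-in for the paper's algorithm. The names `LinkExtractor`, `build_web_graph`, and `cluster_documents`, and the representation of the cache as a dictionary from URL to HTML text, are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute targets of <a href="..."> tags in one document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def build_web_graph(cache):
    """Build an undirected graph over cached documents.

    `cache` maps URL -> HTML text (an assumed representation of the WWW cache).
    An edge joins two cached documents when either one links to the other.
    """
    graph = {url: set() for url in cache}
    for url, html in cache.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        for target in parser.links:
            # Keep only links between documents that are both in the cache.
            if target in cache and target != url:
                graph[url].add(target)
                graph[target].add(url)
    return graph


def cluster_documents(graph):
    """Assumed clustering rule: each connected component is one cluster."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append(component)
    return clusters
```

Calling `cluster_documents(build_web_graph(cache))` on a small cache returns a list of URL sets, one per group of mutually reachable documents. A real implementation would likely weight or filter edges before clustering; that refinement is outside what the abstract states.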