A topic-specific crawler with dynamic concept context graph based on FCA

An effective topic-specific crawler should retrieve as many highly relevant web pages as possible within a limited time. The web pages a user has clicked in the past express the user's interests to a certain extent. Using Formal Concept Analysis (FCA), information is extracted to construct a concept lattice, which is then used to build the Concept Context Graph (CCG). In this paper, we construct background knowledge for the topic-specific crawler from the relevant web pages. We also propose a Dynamic CCG (DCCG), which updates its concepts through an elimination mechanism throughout the crawling process. Finally, several different CCGs are compared experimentally in terms of their performance.
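
To make the FCA step concrete, the following is a minimal sketch of how formal concepts (the nodes of a concept lattice) can be enumerated from a small context. It is an illustration under stated assumptions, not the method used in this paper: the pages, the terms, and the function names (`common_attributes`, `objects_having`, `formal_concepts`) are hypothetical, and the enumeration is a naive closure over all object subsets that is only practical for tiny inputs.

```python
from itertools import combinations

# Hypothetical toy context: each page (object) maps to the terms (attributes) it contains.
context = {
    "page1": {"crawler", "topic"},
    "page2": {"crawler", "lattice"},
    "page3": {"crawler", "topic", "lattice"},
}

def common_attributes(objects, context):
    """Attributes shared by every given object (all attributes if none are given)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared

def objects_having(attrs, context):
    """Objects whose attribute set contains every attribute in attrs."""
    return {obj for obj, obj_attrs in context.items() if attrs <= obj_attrs}

def formal_concepts(context):
    """Naively enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    objs = sorted(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(context), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each resulting pair couples a set of pages with exactly the terms those pages share; ordering the pairs by inclusion of their page sets yields the concept lattice from which a context graph could then be derived.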