A methodology for knowledge acquisition from the web

Fast and easy access to up-to-date information requires information management tools that can explore and analyse the huge number of available electronic resources. The Web offers a large amount of valuable information for every possible domain, but its human-oriented representation and its size make any kind of centralised computer-based processing difficult and extremely time-consuming. In this paper, a combination of distributed AI and knowledge acquisition techniques is proposed to tackle this problem. In particular, we have designed an incremental, domain-independent learning methodology, modelled as a multi-agent system, that crawls the Web and composes knowledge structures (ontologies) by interrelating several automatically obtained taxonomies of terms according to the user’s interests. Moreover, the obtained ontologies are used to represent, in a structured way, the currently available web resources for the corresponding domain. The paper also presents examples of the potential results in medical and technological domains and compares them, wherever possible, against publicly available taxonomic web search engines, reporting a considerable improvement in all cases.

