AGATHE-2: An Adaptive, Ontology-Based Information Gathering Multi-Agent System for Restricted Web Domains

Because of the size of the Web and the diversity of its information, gathering relevant information on the Web is a highly complex task. The main problem with most information retrieval approaches is that they neglect the context of pages, an inherent limitation: search engines are based on keyword indexing, which cannot capture context. In restricted domains, taking context into account through a domain ontology may lead to more relevant and accurate information gathering. In recent years, the authors have pursued research based on this hypothesis and accordingly proposed an agent- and ontology-based cooperative information gathering approach for restricted domains, which can be instantiated in information gathering systems for specific domains, such as academia or tourism. In this chapter, the authors present this approach as a generic software architecture, named AGATHE-2, which is a full-fledged, scalable multi-agent system. Besides offering in-depth treatment of these domains through the use of a domain ontology, this new version applies machine learning techniques to linguistic information in order to accelerate the knowledge acquisition needed for information extraction from Web pages. AGATHE-2 is an agent- and ontology-based system that collects and classifies relevant Web pages about
