Cognitive Multi-agent Systems for Integrated Information Retrieval and Extraction over the Web

On the Web, there are classes of pages with similar structure and content (e.g., call-for-papers pages, reference lists), and these classes are interrelated, forming clusters (e.g., Science). We propose an architecture of cognitive multi-agent systems for information retrieval and extraction from these clusters. Each agent processes one class of pages, employing reusable ontologies to recognize pages, extract as much useful information as possible, and communicate with the other agents. Whenever an agent identifies information of interest to another agent, it forwards that information to it. These "hot hints" usually contain much less garbage than search engine results do. The agent architecture supports several kinds of reuse: all of the code, the database definitions, the knowledge, and the services of search engines. We obtained promising results using Java and Jess.
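The forwarding of "hot hints" between class-specific agents can be illustrated with a minimal Java sketch. It is not the system described here: ontology-based recognition is replaced by a naive keyword match, information extraction is omitted, and every name (`Page`, `ClassAgent`, `HotHintDemo`, the example URLs) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A fetched page: URL plus plain text (hypothetical simplification). */
final class Page {
    final String url;
    final String text;

    Page(String url, String text) {
        this.url = url;
        this.text = text;
    }
}

/** One agent per page class. Keywords stand in for an ontology (assumption). */
class ClassAgent {
    final String pageClass;
    final List<String> accepted = new ArrayList<>();   // URLs recognized as this class
    final List<String> hintsSent = new ArrayList<>();  // log of forwarded hot hints
    private final List<String> keywords;
    private final List<ClassAgent> peers = new ArrayList<>();

    ClassAgent(String pageClass, String... keywords) {
        this.pageClass = pageClass;
        this.keywords = Arrays.asList(keywords);
    }

    void addPeer(ClassAgent peer) {
        peers.add(peer);
    }

    /** Naive recognition: true if any keyword occurs in the page text. */
    boolean recognizes(Page page) {
        String lower = page.text.toLowerCase();
        for (String keyword : keywords) {
            if (lower.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    /** Keep the page if it belongs to this class; otherwise forward it to interested peers. */
    void process(Page page) {
        if (recognizes(page)) {
            accept(page);
            return;
        }
        for (ClassAgent peer : peers) {
            if (peer.recognizes(page)) {
                hintsSent.add(page.url + " -> " + peer.pageClass);
                peer.accept(page);
            }
        }
    }

    /** Record a page once, whether found directly or received as a hint. */
    void accept(Page page) {
        if (!accepted.contains(page.url)) {
            accepted.add(page.url);
        }
    }
}

class HotHintDemo {
    public static void main(String[] args) {
        ClassAgent cfp = new ClassAgent("call-for-papers", "call for papers", "submission deadline");
        ClassAgent refs = new ClassAgent("references", "references", "bibliography");
        cfp.addPeer(refs);
        refs.addPeer(cfp);

        cfp.process(new Page("http://example.org/cfp", "Call for Papers: submission deadline in May"));
        cfp.process(new Page("http://example.org/bib", "Bibliography of agent research"));
        cfp.process(new Page("http://example.org/ad", "Cheap flights this weekend"));

        System.out.println("CFP agent accepted:  " + cfp.accepted);
        System.out.println("Refs agent accepted: " + refs.accepted);
        System.out.println("Hints sent by CFP:   " + cfp.hintsSent);
    }
}
```

In this sketch a page that no agent recognizes is simply dropped, and a hint is delivered by a direct method call; a real multi-agent system would more likely exchange messages in an agent communication language and let the receiving agent decide what to do with the hint.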
