Customizing information capture and access

This article presents a customizable architecture for software agents that capture and access information in large, heterogeneous, distributed electronic repositories. The key idea is to exploit underlying structure at various levels of granularity to build high-level indices with task-specific interpretations. Information agents construct such indices and are configured as a network of reusable modules called structure detectors and segmenters. We illustrate our architecture with the design and implementation of smart information filters in two contexts: retrieving stock market data from Internet newsgroups and retrieving technical reports from Internet FTP sites.
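As a rough illustration of the modular idea described above, the following minimal sketch wires one segmenter and two structure detectors into a small pipeline that builds a label-to-segment index. It is not the article's implementation: the names `blank_line_segmenter`, `looks_like_table`, `looks_like_prose`, and `build_index`, as well as the white-space heuristic for spotting tables, are assumptions introduced here purely for illustration.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical module types: a segmenter splits text into blocks,
# a structure detector decides whether a block has a given structure.
Segmenter = Callable[[str], List[str]]
Detector = Callable[[str], bool]


def blank_line_segmenter(text: str) -> List[str]:
    """Split a document into blocks separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [b for b in blocks if b.strip()]


def looks_like_table(block: str) -> bool:
    """Assumed heuristic: at least two lines, each with three or more
    fields separated by runs of two or more spaces."""
    lines = block.splitlines()
    if len(lines) < 2:
        return False
    return all(len(re.split(r"\s{2,}", ln.strip())) >= 3 for ln in lines)


def looks_like_prose(block: str) -> bool:
    """Assumed heuristic: anything that is not table-like counts as prose."""
    return not looks_like_table(block)


def build_index(text: str, segmenter: Segmenter,
                detectors: Dict[str, Detector]) -> Dict[str, List[str]]:
    """Run the segmenter, then every detector on every block, and
    collect the blocks under the labels of the detectors that fire."""
    index: Dict[str, List[str]] = defaultdict(list)
    for block in segmenter(text):
        for label, detect in detectors.items():
            if detect(block):
                index[label].append(block)
    return dict(index)


if __name__ == "__main__":
    sample = (
        "Markets were mixed today.\n\n"
        "IBM  101.5  +0.3\n"
        "DEC  42.0  -1.1\n\n"
        "See you tomorrow."
    )
    index = build_index(
        sample,
        blank_line_segmenter,
        {"table": looks_like_table, "prose": looks_like_prose},
    )
    for label, blocks in index.items():
        print(label, len(blocks))
```

Because segmenters and detectors are plain functions with a shared signature, a task-specific filter (for example, one that keeps only table-like blocks from newsgroup postings) could be configured by swapping modules rather than rewriting the pipeline.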
