Exploiting the Potential of Concept Lattices for Information Retrieval with CREDO

Recent advances in Formal Concept Analysis (FCA), together with the major changes faced by modern Information Retrieval (IR), present unprecedented challenges and opportunities for FCA-based IR applications. The main advantage of FCA for IR is that it can create a conceptual representation of a given document collection in the form of a document lattice, which may be used both to improve the retrieval of specific items and to drive the mining of the collection's contents. In this paper, we examine the features of FCA best suited to solving IR tasks that conventional systems cannot easily address, as well as the most critical aspects of building FCA-based IR applications. These observations have led to the development of CREDO, a system that allows the user to query Web documents and see the retrieval results organized in a browsable concept lattice; CREDO is the second major focus of the paper. We show that CREDO is especially useful for quickly locating, among the documents retrieved in response to an ambiguous query, those that correspond to the meaning of interest, and for mining the contents of the documents that reference a given entity. An online version of the system is available for testing at http://credo.fub.it.
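
To make the idea of a document lattice concrete, the following minimal sketch enumerates the formal concepts (extent, intent pairs) of a tiny document-term context. It is an illustration only and not CREDO's actual algorithm: the function name formal_concepts, the four documents, and their index terms are all hypothetical, chosen to mimic the results of an ambiguous query such as "jaguar". The sketch relies on the standard FCA fact that the concept intents are exactly the intersections of the documents' term sets, together with the full term set.

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs of a document-term context.

    `context` maps each document id to the set of terms that index it.
    """
    # The full term set is the intent of the bottom concept.
    all_terms = frozenset().union(*context.values())
    intents = {all_terms}

    # Close the family of intents under intersection, one document at a time.
    for terms in context.values():
        terms = frozenset(terms)
        intents |= {terms & other for other in intents}

    # The extent of an intent is every document containing all of its terms.
    concepts = []
    for intent in intents:
        extent = frozenset(d for d, t in context.items() if intent <= t)
        concepts.append((extent, intent))

    # Most general concepts (largest extents) first.
    concepts.sort(key=lambda c: (-len(c[0]), sorted(c[1])))
    return concepts


# Hypothetical toy collection, invented for illustration.
docs = {
    "d1": {"jaguar", "car", "engine"},
    "d2": {"jaguar", "car", "dealer"},
    "d3": {"jaguar", "cat", "habitat"},
    "d4": {"jaguar", "cat", "zoo"},
}

concepts = formal_concepts(docs)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

In this toy context the top concept groups all four documents under the shared term "jaguar", and its two children separate the "car" documents from the "cat" documents, which is the kind of sense-based grouping a browsable lattice exposes to the user. The brute-force closure used here is only practical for small contexts.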
