Extracting Threshold Conceptual Structures from Web Documents

In this paper we describe an iterative approach, based on formal concept analysis, to refine the information retrieval process. Based on weights for ranking documents, we define a weighted formal context. We use a Galois connection to introduce a new type of formal concept that allows us to work with specific thresholds when searching for words in Web documents. By increasing the threshold, we obtain smaller lattices with more relevant concepts, which improves the retrieval of more specific items. We use techniques for processing large data sets in parallel to generate sequences of Galois lattices, which avoids the time cost of building a single lattice for an entire large context.
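
To make the idea of a thresholded weighted context concrete, the following is a minimal sketch in Python. It is not the construction from the paper, whose exact definitions are not given here. It assumes that a document is related to a term when its weight for that term is at least the chosen threshold, and it enumerates concepts by brute force, which is only practical for tiny contexts. The weights, document names, and the function names `extent`, `intent`, and `concepts` are all hypothetical.

```python
from itertools import combinations

# Hypothetical document-term weights (for example, normalized ranking scores).
# These values are illustrative only and do not come from the paper.
WEIGHTS = {
    "d1": {"lattice": 0.9, "concept": 0.5, "web": 0.1},
    "d2": {"lattice": 0.2, "concept": 0.8, "web": 0.6},
    "d3": {"lattice": 0.4, "concept": 0.1, "web": 0.5},
}
TERMS = sorted({term for row in WEIGHTS.values() for term in row})


def extent(terms, threshold):
    """Documents whose weight is >= threshold for every term in `terms`."""
    return frozenset(
        doc for doc, row in WEIGHTS.items()
        if all(row.get(term, 0.0) >= threshold for term in terms)
    )


def intent(docs, threshold):
    """Terms whose weight is >= threshold in every document in `docs`."""
    return frozenset(
        term for term in TERMS
        if all(WEIGHTS[doc].get(term, 0.0) >= threshold for doc in docs)
    )


def concepts(threshold):
    """All (extent, intent) pairs closed under the two derivation operators.

    Brute force over every subset of terms: exponential in the number of
    terms, so suitable only for small illustrative contexts.
    """
    found = set()
    for size in range(len(TERMS) + 1):
        for subset in combinations(TERMS, size):
            docs = extent(subset, threshold)
            found.add((docs, intent(docs, threshold)))
    return found


if __name__ == "__main__":
    for threshold in (0.3, 0.7):
        print(threshold, len(concepts(threshold)), sorted(extent({"concept"}, threshold)))
```

With these particular weights, raising the threshold from 0.3 to 0.7 reduces the number of concepts and narrows the documents returned for the term `concept`. The extent of a fixed query can never grow when the threshold rises, since fewer document-term pairs pass the cut. The size of the lattice as a whole depends on the data, so this small example should not be read as a general guarantee.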
