Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research

Formal Concept Analysis (FCA) is an unsupervised clustering technique, and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 that mention FCA and information retrieval in the abstract, title, or keywords. Using a prototype of our FCA-based toolset CORDIET, we converted the PDF files containing the papers to plain text, indexed them with Lucene using a thesaurus of terms related to FCA research, and then created the concept lattice shown in this paper. We visualized, analyzed, and explored the literature with concept lattices and discovered multiple interesting research streams in IR, of which we give an extensive overview. The core contributions of this paper are the innovative application of FCA to the text mining of scientific papers and the survey of FCA-based IR research.
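
The pipeline described above (plain text, thesaurus-based indexing, concept lattice) can be illustrated with a minimal sketch. This is not the CORDIET implementation and does not use Lucene: simple substring matching stands in for the indexing step, and the thesaurus entries, document texts, and function names are all hypothetical. The concepts are enumerated naively by closing every attribute subset, which is exponential in the number of thesaurus terms and only suitable for toy inputs.

```python
from itertools import combinations

# Hypothetical thesaurus: canonical term -> surface forms searched in the text.
THESAURUS = {
    "fca": ["formal concept analysis", "concept lattice"],
    "ir": ["information retrieval", "document retrieval"],
    "web": ["web search", "web service"],
}

# Hypothetical toy documents standing in for the plain text of the papers.
DOCS = {
    "p1": "Concept lattice navigation for information retrieval.",
    "p2": "Formal concept analysis of web search results.",
    "p3": "Document retrieval with web service annotation.",
}


def build_context(docs, thesaurus):
    """Build a formal context: document id -> set of thesaurus terms it mentions."""
    context = {}
    for doc_id, text in docs.items():
        lowered = text.lower()
        context[doc_id] = frozenset(
            term
            for term, variants in thesaurus.items()
            if any(variant in lowered for variant in variants)
        )
    return context


def extent(context, attrs):
    """All documents that have every attribute in attrs."""
    return frozenset(d for d, doc_attrs in context.items() if attrs <= doc_attrs)


def intent(context, objs, all_attrs):
    """All attributes shared by every document in objs."""
    shared = frozenset(all_attrs)
    for d in objs:
        shared &= context[d]
    return shared


def formal_concepts(context, all_attrs):
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    attrs = sorted(all_attrs)
    concepts = set()
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            ext = extent(context, frozenset(subset))
            concepts.add((ext, intent(context, ext, attrs)))
    return concepts


context = build_context(DOCS, THESAURUS)
concepts = formal_concepts(context, THESAURUS.keys())

# Print concepts from the most general (largest extent) to the most specific.
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair is one node of the concept lattice: a set of papers together with the thesaurus terms they all share. Ordering these pairs by extent inclusion gives the lattice itself; a practical implementation would likely use a dedicated algorithm such as NextClosure rather than subset enumeration.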
