Spatial Indexing for Scalability in FCA

This paper provides evidence that spatial indexing structures resolve Formal Concept Analysis queries faster than B-Tree and hash-based methods. We show that many Formal Concept Analysis operations, including computing contingent and extent sizes and listing the matching objects, perform better with spatial indexing structures such as the RD-Tree. Speedups of up to eighty times are observed, depending on the data and the query. The motivation for our study is the application of Formal Concept Analysis to Semantic File Systems, where millions of formal objects must be handled. We also find that spatial indexing is an effective technique for more general-purpose applications that require scalability in Formal Concept Analysis systems. The coverage and benchmarking are therefore presented with general applications in mind.
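To make the queries named above concrete, the following is a minimal sketch of extent and contingent computation over a tiny formal context. It is not the paper's implementation: the `FILES` data and the function names are hypothetical, and it assumes the usual FCA definitions, where the extent of an attribute set is every object having all of those attributes and the contingent is every object whose attribute set is exactly that set.

```python
from typing import Dict, FrozenSet, List, Set

# A formal context as a mapping: object name -> set of attributes it has.
Context = Dict[str, FrozenSet[str]]


def extent(context: Context, attrs: Set[str]) -> List[str]:
    """Objects that have every attribute in attrs (a superset test)."""
    return sorted(obj for obj, has in context.items() if attrs <= has)


def contingent(context: Context, attrs: Set[str]) -> List[str]:
    """Objects whose attribute set is exactly attrs (an equality test)."""
    return sorted(obj for obj, has in context.items() if has == attrs)


# Hypothetical example data: files as formal objects, tags as attributes.
FILES: Context = {
    "a.txt": frozenset({"text", "small"}),
    "b.txt": frozenset({"text"}),
    "c.png": frozenset({"image", "small"}),
    "d.txt": frozenset({"text", "small", "recent"}),
}

if __name__ == "__main__":
    print(extent(FILES, {"text"}))           # files tagged at least "text"
    print(len(extent(FILES, {"text"})))      # extent size
    print(contingent(FILES, {"text"}))       # files tagged only "text"
```

Both functions scan every object, so their cost grows linearly with the number of formal objects. The superset and equality tests in the two comprehensions are the kind of set predicate that a set-oriented index such as the RD-Tree is intended to answer without a full scan.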
