Exploiting Coarse Grained Parallelism in Conceptual Data Mining

A parallel implementation of Ganter’s algorithm for calculating concept lattices in Formal Concept Analysis is presented. The algorithm’s performance was determined experimentally in a benchmark on three machines: an AMD Athlon64, an Intel dual Xeon, and an UltraSPARC T1, running 1, 4, and 24 parallel threads respectively. Two subsets of the Cranfield collection were chosen as document sets. In addition, the theoretical maximum performance was determined. Due to scheduling problems, the performance of the UltraSPARC was disappointing, and two alternative schedulers are proposed to tackle this problem. It is shown that, given a good scheduler, the algorithm can massively exploit multi-threaded architectures and so substantially reduce the computational burden of Formal Concept Analysis.
