Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework

While many existing formal concept analysis algorithms are efficient, they are typically unsuitable for distributed implementation. Taking the MapReduce (MR) framework as our inspiration, we introduce a distributed approach to formal concept mining. The novelty of our method lies in its use of Twister, a lightweight MapReduce runtime that is better suited to iterative algorithms than recent distributed approaches. First, we describe the theoretical foundations underpinning our distributed formal concept analysis approach. Second, we provide a representative exemplar of how a classic centralized algorithm can be implemented in a distributed fashion using our methodology: we modify Ganter's classic algorithm to obtain a family of $\mbox{MR}^{*}$ algorithms, namely MRGanter and MRGanter+, where the prefix denotes the algorithm's lineage. To evaluate the factors that affect distributed algorithm performance, we compare our $\mbox{MR}^{*}$ algorithms with the state of the art. Experiments conducted on real datasets demonstrate that MRGanter+ is efficient and scalable, making it an appealing algorithm for distributed problems.
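To make the starting point concrete, the following is a minimal sketch of the centralized closure-based enumeration commonly attributed to Ganter (NextClosure), followed by a map/reduce-style closure over object partitions. It is not the authors' MRGanter or MRGanter+ implementation and does not use the Twister API. The toy context, the names `ATTRIBUTES`, `CONTEXT`, `closure`, `next_closure`, `all_intents` and `partitioned_closure`, and the choice to partition the context by objects are all assumptions made for illustration. The sketch relies on one standard fact: the closure of an attribute set over the whole context equals the intersection of the closures computed on disjoint subsets of the objects.

```python
from functools import reduce

# Hypothetical toy formal context: each object maps to its attribute set.
ATTRIBUTES = ["a", "b", "c", "d"]
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"d"},
}


def closure(attrs, rows):
    """Return the attributes shared by every row that contains attrs."""
    result = set(ATTRIBUTES)  # stays the full set when no row matches
    for row in rows:
        if attrs <= row:
            result &= row
    return frozenset(result)


def next_closure(current, rows):
    """Return the next closed attribute set in lectic order, or None."""
    for i in range(len(ATTRIBUTES) - 1, -1, -1):
        m = ATTRIBUTES[i]
        if m in current:
            continue
        smaller = set(ATTRIBUTES[:i])
        candidate = closure((current & smaller) | {m}, rows)
        # Accept only if no attribute smaller than m was newly added.
        if not (candidate - current) & smaller:
            return candidate
    return None


def all_intents(rows):
    """Enumerate every concept intent, each exactly once."""
    intents = []
    current = closure(frozenset(), rows)
    while current is not None:
        intents.append(current)
        current = next_closure(current, rows)
    return intents


def partitioned_closure(attrs, partitions):
    """Map: local closure per object partition. Reduce: intersect them."""
    local = [closure(attrs, part) for part in partitions]        # map step
    return reduce(frozenset.intersection, local, frozenset(ATTRIBUTES))  # reduce step


rows = list(CONTEXT.values())
partitions = [rows[:2], rows[2:]]  # assumed split of objects across two workers
intents = all_intents(rows)
```

In this sketch, `partitioned_closure` could replace `closure` inside `next_closure` without changing the result, which is the property that makes an iterative map/reduce formulation plausible: each iteration maps a candidate attribute set to every partition and reduces the local results by intersection.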
