Three Interrelated FCA Methods for Mining Biclusters of Similar Values on Columns

Biclustering a numerical data table consists of detecting strong associations between a subset of objects and a subset of attributes. Such biclusters are interesting because they model the data as local patterns. While there exist several definitions of biclusters, depending on the constraints they must satisfy, this paper focuses on biclusters of similar values on columns. The literature offers several ad hoc methods for mining such biclusters. We focus here on two aspects: genericity and efficiency. We show that Formal Concept Analysis provides a mathematical framework that characterizes these biclusters in several ways and also allows them to be computed with existing, efficient algorithms. The proposed methods, which rely on pattern structures and triadic concept analysis, are evaluated and compared on two different datasets.
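
As an illustration of the notion of a bicluster of similar values on columns, the following minimal sketch checks whether a chosen set of rows and columns forms one. It assumes a common reading of the definition that the abstract itself does not spell out: within each selected column, the values over the selected rows differ by at most a tolerance `theta`. The function name, the parameter `theta`, and the toy table are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def is_similar_column_bicluster(
    table: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
    theta: float,
) -> bool:
    """Check the assumed similarity constraint on a candidate bicluster.

    Assumption (not stated in the abstract): a pair (rows, cols) qualifies
    when, in every selected column, the spread (max - min) of the values
    over the selected rows is at most theta.
    """
    if not rows:
        # An empty row set satisfies the constraint vacuously.
        return True
    for c in cols:
        values = [table[r][c] for r in rows]
        if max(values) - min(values) > theta:
            return False
    return True


# Hypothetical toy table: 4 objects (rows) by 5 attributes (columns).
table = [
    [1, 2, 2, 1, 6],
    [2, 1, 1, 0, 6],
    [2, 2, 1, 7, 6],
    [8, 9, 2, 6, 7],
]

# The first three rows agree within 1 on the first three columns.
ok = is_similar_column_bicluster(table, rows=[0, 1, 2], cols=[0, 1, 2], theta=1)

# Adding the fourth row breaks the constraint on column 0 (values 1, 2, 2, 8).
not_ok = is_similar_column_bicluster(table, rows=[0, 1, 2, 3], cols=[0], theta=1)
```

This check only verifies a candidate; it does not enumerate maximal biclusters, which is the task the methods named in the abstract address.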
