Can triconcepts become triclusters?

Abstract: Two novel approaches to triclustering of three-way binary data are proposed. A tricluster is defined as a dense subset of a ternary relation Y on sets of objects, attributes, and conditions or, equivalently, as a dense submatrix of the adjacency matrix of Y. This definition is a scalable relaxation of the notion of triconcept in Triadic Concept Analysis, and each triconcept of the initial dataset is contained in some tricluster. The approach generalizes one previously introduced for concept-based biclustering. We also propose a hierarchical spectral triclustering algorithm for mining dense submatrices of the adjacency matrix of Y. Finally, we describe some applications of the proposed techniques, compare the proposed approaches, and study their performance in a series of experiments with real datasets.
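To make the notion of a dense subset concrete, the following minimal sketch computes the density of a candidate box in a toy ternary relation. It assumes that density means the fraction of triples in the box objects x attributes x conditions that belong to Y; the abstract does not spell out the measure, and the function name `density` and the toy data are hypothetical, chosen only for illustration.

```python
from itertools import product


def density(relation, objects, attributes, conditions):
    """Fraction of the box objects x attributes x conditions covered by the relation.

    Assumed measure (not taken from the abstract): number of triples of the
    box that belong to the relation, divided by the volume of the box.
    """
    volume = len(objects) * len(attributes) * len(conditions)
    if volume == 0:
        return 0.0
    hits = sum(
        (g, m, b) in relation
        for g, m, b in product(objects, attributes, conditions)
    )
    return hits / volume


# Toy ternary relation Y: (object, attribute, condition) triples.
# Only the triple ("u2", "t2", "r2") is missing from the full 2 x 2 x 2 box.
Y = {
    ("u1", "t1", "r1"), ("u1", "t1", "r2"),
    ("u1", "t2", "r1"), ("u1", "t2", "r2"),
    ("u2", "t1", "r1"), ("u2", "t1", "r2"),
    ("u2", "t2", "r1"),
}

# A fully filled box (every triple is in Y), as a triconcept would be.
full_box = density(Y, {"u1"}, {"t1", "t2"}, {"r1", "r2"})

# A larger box that contains the previous one but tolerates one missing triple.
relaxed_box = density(Y, {"u1", "u2"}, {"t1", "t2"}, {"r1", "r2"})
```

Here the first box is completely filled, while the second one contains it and is still dense although not complete, which is the kind of relaxation the definition describes. Any density threshold used to accept or reject a box would be a further assumption.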
