CECM: Adding pairwise constraints to evidential clustering

Fuzzy and hard partitioning methods group objects according to their similarity. Recently, a new concept of partition based on belief function theory, called a credal partition, has been proposed and shown to generate a meaningful description of the data. Hard, fuzzy, or credal partitions are generally obtained with unsupervised learning methods, which use only the numeric descriptions of the objects to compute their similarity. However, in some applications, background knowledge about the objects or about the clusters is available. To integrate this auxiliary information, constraint-based (or semi-supervised) methods have been proposed. A popular type of constraint specifies whether two objects are in the same cluster (must-link) or in different clusters (cannot-link). We propose a new algorithm, called CECM, which computes a credal partition using a constrained clustering method. We show how to translate the available information into constraints and how to integrate them into the search for the credal partition. The paper ends with experimental results: CECM is compared with other constrained clustering algorithms, and an application to image segmentation is described.
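To make the two ideas above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the CECM algorithm itself: the function names (`count_violations`, `pl_same_and_different`) and the data layout are hypothetical. The first function counts how many must-link and cannot-link constraints a hard partition violates. The second treats a credal partition as one mass function per object over subsets of clusters and, assuming the two mass functions are combined by a simple product, computes the plausibility that two objects share a cluster and the plausibility that they do not. Whether CECM uses exactly these quantities is not stated in this abstract.

```python
from itertools import product


def count_violations(labels, must_link, cannot_link):
    """Count pairwise constraints violated by a hard partition.

    labels: sequence of cluster labels, one per object.
    must_link / cannot_link: iterables of (i, j) index pairs.
    """
    violated_ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    violated_cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return violated_ml + violated_cl


def pl_same_and_different(m_i, m_j):
    """Plausibility that two objects are in the same / different clusters.

    m_i, m_j: dicts mapping frozenset of cluster indices to a mass in [0, 1].
    Assumption (not from the source): the joint mass of a pair of focal sets
    (A, B) is the product m_i[A] * m_j[B].
    """
    pl_same = 0.0
    pl_diff = 0.0
    for (A, a), (B, b) in product(m_i.items(), m_j.items()):
        if not A or not B:
            continue  # mass on the empty set supports neither event
        if A & B:
            # some cluster is compatible with both objects
            pl_same += a * b
        if not (len(A) == 1 and A == B):
            # some pair of distinct clusters is compatible with (A, B)
            pl_diff += a * b
    return pl_same, pl_diff


if __name__ == "__main__":
    # Hard partition of four objects with one must-link and one cannot-link.
    print(count_violations([0, 0, 1, 1], must_link=[(0, 2)], cannot_link=[(0, 1)]))

    # Two objects described by mass functions over clusters {0, 1}.
    m1 = {frozenset({0}): 0.6, frozenset({0, 1}): 0.4}
    m2 = {frozenset({1}): 0.7, frozenset({0, 1}): 0.3}
    print(pl_same_and_different(m1, m2))
```

In this sketch, a must-link constraint on a pair would favour solutions where the plausibility of "different clusters" is low, and a cannot-link constraint would favour solutions where the plausibility of "same cluster" is low.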
