Evolutionary multi-objective distance metric learning for multi-label clustering

In data mining and machine learning, the definition of the distance between two data points substantially affects clustering and classification tasks. We propose a distance metric learning (DML) method for multi-label clustering that uses evolutionary multi-objective optimization and a cluster validity measure with a neighbor relation that simultaneously evaluates inter-cluster and intra-cluster structure. The proposed method produces clustering results that take multiple class labels into account and supports the induction of knowledge about relations between class labels in multi-label clustering, or between objective functions and elements of the transform matrix. Experimental results show that the proposed DML method produces better transform matrices than single-objective optimization and helps identify the attributes that affect the trade-off among objective functions.
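The two ingredients named above, a distance induced by a transform matrix and a trade-off among several objective functions, can be illustrated with a minimal sketch. It is not the proposed method: the names `transformed_distance` and `pareto_front` are hypothetical, the distance is assumed to be the Euclidean norm after a linear map `M`, each objective is assumed to be a per-label validity score to be maximized, and the evolutionary search itself is omitted.

```python
import numpy as np


def transformed_distance(x, y, M):
    """Distance between x and y after the linear map M: ||M (x - y)||_2.

    Assumption: the learned metric acts through a transform matrix M.
    """
    diff = M @ (np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sqrt(diff @ diff))


def pareto_front(scores):
    """Return indices of non-dominated rows of `scores`.

    Assumption: each row holds one candidate's objective values
    (for example, one validity score per class label), all maximized.
    """
    scores = np.asarray(scores, dtype=float)
    front = []
    for i, s in enumerate(scores):
        # Candidate i is dominated if some other candidate is at least as
        # good on every objective and strictly better on at least one.
        dominated = any(
            np.all(t >= s) and np.any(t > s)
            for j, t in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# A diagonal transform that stretches attribute 0 and shrinks attribute 1.
M = np.diag([2.0, 0.5])
d_attr0 = transformed_distance([0.0, 0.0], [1.0, 0.0], M)
d_attr1 = transformed_distance([0.0, 0.0], [0.0, 1.0], M)

# Hypothetical objective values for four candidate transform matrices.
candidate_scores = [[0.9, 0.2], [0.5, 0.5], [0.4, 0.4], [0.2, 0.8]]
front = pareto_front(candidate_scores)
```

Here the third candidate is dominated by the second, so only the other three remain as trade-off solutions. Comparing the matrix elements of such non-dominated candidates is one way to see which attributes drive the trade-off among objectives.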