From variable weighting to cluster characterization in topographic unsupervised learning

We introduce a new learning approach that simultaneously provides a Self-Organizing Map (SOM) and a local weight vector for each cluster. The proposed approach is computationally simple and learns a different feature-weight vector (relevance vector) for each cell. Building on the Self-Organizing Map, we present two new algorithms that perform clustering and weighting simultaneously: local weighting of observations (lwo-SOM) and local weighting of distances (lwd-SOM). Both algorithms achieve the same goal by minimizing different cost functions. After the learning phase, a selection method based on the weight vectors is used to prune irrelevant variables and thus characterize the clusters. We illustrate the performance of the proposed approach on a number of synthetic and real data sets, showing the benefits of local weighting in self-organizing models.
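To make the idea concrete, the sketch below shows a SOM whose best-matching unit is chosen with a locally weighted distance, in the spirit of lwd-SOM. It is a minimal illustration only: the per-cell weight update (a softmax of the negative within-cell dispersion of each variable, with a hypothetical temperature parameter beta) is an assumption for clarity, not the cost-function minimization derived in the paper, and all function and parameter names are illustrative.

```python
# Minimal sketch of a SOM with per-cell relevance weights (lwd-SOM flavour).
# The weight update rule is an assumed heuristic for illustration, not the
# authors' exact algorithm.
import numpy as np

def weighted_som(X, grid_shape=(5, 5), n_epochs=20, lr=0.5, sigma=1.5, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    n_cells = grid_shape[0] * grid_shape[1]

    # Prototype vectors and per-cell relevance weights (each row sums to 1).
    prototypes = X[rng.choice(n_samples, n_cells, replace=False)].astype(float)
    weights = np.full((n_cells, n_features), 1.0 / n_features)

    # 2-D grid coordinates used by the neighborhood function.
    coords = np.array([(i, j) for i in range(grid_shape[0])
                               for j in range(grid_shape[1])], dtype=float)

    for _ in range(n_epochs):
        for x in X[rng.permutation(n_samples)]:
            # Best-matching unit under the locally weighted distance.
            d2 = ((x - prototypes) ** 2 * weights).sum(axis=1)
            bmu = int(np.argmin(d2))

            # Gaussian neighborhood on the map grid.
            grid_d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2 * sigma ** 2))

            # Standard SOM prototype update, modulated by the neighborhood.
            prototypes += lr * h[:, None] * (x - prototypes)

        # Assumed weight update: variables with small within-cell dispersion
        # receive larger relevance (softmax with inverse temperature beta).
        d2_all = (X[:, None, :] - prototypes[None, :, :]) ** 2
        bmus = np.argmin((d2_all * weights[None]).sum(axis=2), axis=1)
        for c in range(n_cells):
            members = X[bmus == c]
            if len(members) == 0:
                continue
            disp = ((members - prototypes[c]) ** 2).mean(axis=0)
            w = np.exp(-beta * disp)
            weights[c] = w / w.sum()

    return prototypes, weights

# Example usage on random data: large weights in a cell's relevance vector
# point to the variables that characterize that cluster.
if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(200, 6))
    protos, relevances = weighted_som(X)
    print(relevances[0])  # relevance vector of the first map cell
```

Pruning could then follow the same logic as in the paper: variables whose relevance stays below a threshold in a given cell would be discarded when describing that cluster.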
