SOM-ELM - Self-Organized Clustering using ELM

This paper presents two new clustering techniques based on the Extreme Learning Machine (ELM). Both techniques can incorporate a priori knowledge from an expert to define the optimal structure of the clusters, i.e., the number of points in each cluster. Using ELM, the first proposed clustering formulation can be rewritten as a Traveling Salesman Problem and solved with a heuristic optimization method. The second proposed formulation combines a priori knowledge with self-organization based on a predefined map (or string). The clustering methods are tested successfully on five toy examples and two real datasets.
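
For readers unfamiliar with ELM, the following is a minimal sketch of the standard ELM regression model on which such techniques build: a hidden layer with random, fixed weights, followed by output weights obtained in closed form with the Moore-Penrose pseudoinverse. It is not the clustering formulation proposed in the paper. The function names `elm_fit` and `elm_predict`, the tanh activation, the number of hidden neurons, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def elm_fit(X, T, n_hidden=50, seed=0):
    """Fit a basic ELM (illustrative sketch, not the paper's method)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix
    # Output weights: least-squares solution via the pseudoinverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Hypothetical toy data: a one-dimensional sine curve.
X = np.linspace(-1, 1, 200).reshape(-1, 1)
T = np.sin(3 * X)

W, b, beta = elm_fit(X, T)
mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print("training MSE:", mse)
```

Only the output weights `beta` are learned, and they are linear in the targets `T`; this closed-form dependence is what makes ELM convenient as a building block in optimization-based formulations.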
