Deep embedded self-organizing maps for joint representation learning and topology-preserving clustering

A recent research area in unsupervised learning combines representation learning with deep neural networks and data clustering. While the success of deep learning on supervised tasks is widely established, recent research has shown that neural networks can also learn representations that improve clustering in their intermediate feature space when trained with specific regularizations. By treating representation learning and clustering as a joint task, models learn clustering-friendly spaces and outperform two-stage approaches in which dimensionality reduction and clustering are performed separately. Recently, this idea has been extended to topology-preserving clustering models known as self-organizing maps (SOM). This work is a thorough study of the deep embedded self-organizing map (DESOM), a model composed of an autoencoder and a SOM layer that jointly trains the code vectors and network weights to learn SOM-friendly representations. In other words, the SOM induces a form of regularization that improves the quality of quantization and topology in latent space. After detailing the architecture, loss and training algorithm, we study the hyperparameters in a series of experiments. Different SOM-based models are evaluated in terms of clustering, visualization and classification on benchmark datasets. We study the benefits and trade-offs of joint representation learning and self-organization. DESOM achieves competitive results, requires no pretraining and produces topologically organized visualizations.
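To make the idea of a SOM acting as a regularizer on the latent space more concrete, the following is a minimal pure-Python sketch of one way such a joint objective could look. It is not taken from the abstract, which does not state the loss: the Gaussian neighborhood kernel, the Manhattan distance on a rectangular map, and the names `som_loss`, `joint_loss`, `temperature` and `gamma` are all assumptions made for illustration.

```python
import math


def grid_distance(k, l, n_cols):
    """Manhattan distance between map units k and l on a rectangular grid."""
    row_k, col_k = divmod(k, n_cols)
    row_l, col_l = divmod(l, n_cols)
    return abs(row_k - row_l) + abs(col_k - col_l)


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def som_loss(latents, prototypes, n_cols, temperature):
    """Neighborhood-weighted quantization error in latent space (assumed form).

    For each latent code, find its best-matching unit (BMU), then sum the
    squared distances to every prototype, weighted by a Gaussian kernel of
    the map distance between that prototype's unit and the BMU.
    """
    total = 0.0
    for z in latents:
        dists = [squared_distance(z, m) for m in prototypes]
        bmu = min(range(len(prototypes)), key=dists.__getitem__)
        for k, dist in enumerate(dists):
            d_map = grid_distance(bmu, k, n_cols)
            weight = math.exp(-(d_map ** 2) / temperature ** 2)
            total += weight * dist
    return total / len(latents)


def joint_loss(reconstruction_error, som_term, gamma):
    """Autoencoder reconstruction error plus a gamma-weighted SOM term."""
    return reconstruction_error + gamma * som_term


# Toy example: a 2x2 map with prototypes on the unit square, one latent code.
prototypes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
latents = [(0.0, 0.0)]
som_term = som_loss(latents, prototypes, n_cols=2, temperature=1.0)
total = joint_loss(reconstruction_error=0.1, som_term=som_term, gamma=0.5)
```

In this sketch, a large `temperature` pulls many neighboring prototypes toward each latent code, while a very small one reduces the term to a plain quantization error on the BMU alone. In a real model both terms would be minimized by gradient descent over the network weights and the prototypes together, which is what joint training refers to here.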
