Data Visualization and Analysis with Self-Organizing Maps in Learning Metrics

The Self-Organizing Map (SOM) clusters high-dimensional data and visualizes it on a lower-dimensional display, which makes it a useful tool for data visualization and analysis. Its results depend on the distance measure of the data space, which is often Euclidean. We introduce an improved metric that emphasizes important local directions by measuring changes in an auxiliary property of interest of the data points, for example their class. A Self-Organizing Map is computed in the new metric and used for visualizing and clustering the data. The trained map represents the directions of highest relevance for the property of interest. For data analysis it is especially beneficial that the importance of the original data variables can be assessed and visualized throughout the data space. We apply the method to analyze the bankruptcy risk of Finnish enterprises.
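
To make the idea of a metric that "measures changes in an auxiliary property" concrete, the sketch below shows one possible reading. It is not taken from the abstract: it assumes that the local squared distance is dx^T J(x) dx, where J(x) is the Fisher information matrix of a class-posterior model p(c | x) taken with respect to x, and it assumes a toy softmax model with hypothetical parameters W and b. All function names are illustrative. The winner search of a SOM is then done in this local metric instead of the Euclidean one.

```python
import numpy as np

def class_posteriors(x, W, b):
    """Toy softmax model p(c | x); W and b are hypothetical parameters."""
    z = W @ x + b
    z = z - z.max()              # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def fisher_matrix(x, W, b):
    """Assumed local metric J(x) = sum_c p(c|x) g_c g_c^T,
    where g_c is the gradient of log p(c|x) with respect to x."""
    p = class_posteriors(x, W, b)
    # For a softmax model, grad_x log p(c|x) = W[c] - sum_k p_k W[k].
    g = W - p @ W
    return (g * p[:, None]).T @ g

def local_sq_distance(x, m, W, b):
    """Squared local distance (m - x)^T J(x) (m - x), evaluated at x."""
    dx = m - x
    return float(dx @ fisher_matrix(x, W, b) @ dx)

def winner(x, codebook, W, b):
    """Index of the codebook (map unit) vector closest to x in the local metric."""
    J = fisher_matrix(x, W, b)
    diffs = codebook - x
    d2 = np.einsum("ij,jk,ik->i", diffs, J, diffs)
    return int(np.argmin(d2))

# Two classes whose posterior depends only on the first variable.
W = np.array([[1.0, 0.0], [-1.0, 0.0]])
b = np.zeros(2)
x = np.zeros(2)
codebook = np.array([[1.0, 0.0], [0.0, 5.0]])
best = winner(x, codebook, W, b)
```

In this toy setup the class posterior does not change along the second variable, so a displacement in that direction costs nothing in the assumed metric. The unit at (0, 5) therefore wins over the unit at (1, 0), although the latter is closer in Euclidean terms. The diagonal of J(x) also gives a rough local indication of how much each original variable matters, which is the kind of assessment the abstract describes.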
