How to use the Kohonen algorithm to simultaneously analyze individuals and modalities in a survey

The Kohonen algorithm, or self-organizing map (SOM) (Kohonen, Self-Organization and Associative Memory, Springer Series in Information Sciences, vol. 8, Springer, Berlin, 1984; Self-Organizing Maps, Springer Series in Information Sciences, vol. 30, Springer, Berlin, 1995), is a powerful tool for data analysis. It was originally designed to model organized connections in some biological neural networks, and it was soon recognized as a good algorithm for vector quantization that also yields a pertinent classification with useful visualization properties. When the individuals are described by quantitative variables (ratios, frequencies, measurements, amounts, etc.), a straightforward application of the original algorithm builds code vectors and associates with each of them the class of all individuals that are more similar to that code vector than to any other. When the individuals are described by categorical (qualitative) variables with a finite number of modalities, as in a survey, a specific algorithm has to be defined. In this paper, we present a new algorithm, inspired by the SOM algorithm, that provides a simultaneous classification of the individuals and of their modalities.
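To make the quantitative case concrete, here is a minimal sketch of a standard SOM in plain Python. It illustrates only the textbook algorithm for quantitative variables, not the new algorithm for categorical data that this paper introduces. The one-dimensional map (a string of units), the Gaussian neighbourhood, the linear decay schedules, and the function names `train_som`, `nearest_unit` and `classify` are all assumptions chosen for illustration.

```python
import math
import random


def nearest_unit(codes, x):
    """Index of the code vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codes)),
        key=lambda u: sum((c - xi) ** 2 for c, xi in zip(codes[u], x)),
    )


def train_som(data, n_units, n_iter=2000, seed=0):
    """Train a 1-D Kohonen map (a string of n_units) on quantitative data."""
    rng = random.Random(seed)
    # Initialise each code vector with a randomly drawn observation.
    codes = [list(rng.choice(data)) for _ in range(n_units)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01                    # decreasing learning rate
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
        x = rng.choice(data)
        winner = nearest_unit(codes, x)
        for u in range(n_units):
            # Units close to the winner on the map are updated more strongly.
            h = math.exp(-((u - winner) ** 2) / (2.0 * radius ** 2))
            codes[u] = [c + lr * h * (xi - c) for c, xi in zip(codes[u], x)]
    return codes


def classify(codes, data):
    """Attach each individual (by index) to the class of its nearest code vector."""
    classes = [[] for _ in codes]
    for i, x in enumerate(data):
        classes[nearest_unit(codes, x)].append(i)
    return classes


if __name__ == "__main__":
    # Toy data: two groups of individuals described by two quantitative variables.
    demo_rng = random.Random(1)
    individuals = [(demo_rng.gauss(0, 0.3), demo_rng.gauss(0, 0.3)) for _ in range(20)]
    individuals += [(demo_rng.gauss(5, 0.3), demo_rng.gauss(5, 0.3)) for _ in range(20)]
    code_vectors = train_som(individuals, n_units=4)
    for unit, members in enumerate(classify(code_vectors, individuals)):
        print(unit, code_vectors[unit], members)
```

Each class collects the individuals whose nearest code vector is the same unit, which is the association described above. A categorical survey table cannot be fed to this sketch directly, since the Euclidean distance on raw modality labels is not meaningful; this is the gap the proposed algorithm addresses.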
