Unsupervised pattern recognition models for mixed feature-type symbolic data

Unsupervised pattern recognition methods for mixed feature-type symbolic data, based on the dynamic clustering methodology with adaptive distances, are presented. These distances change at each iteration of the algorithm and can either be the same for all clusters or differ from one cluster to another. The methods require a pre-processing step that homogenizes the mixed feature-type symbolic data into histogram-valued symbolic data. The dynamic clustering algorithms therefore take as input a set of vectors of histogram-valued symbolic data and produce a partition, together with a prototype for each cluster, by optimizing an adequacy criterion based on suitable adaptive squared Euclidean distances. The usefulness of these methods is illustrated with synthetic symbolic data sets and with applications to real symbolic data sets. Various tools for interpreting the partition and the clusters produced by these algorithms are also presented.
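
The loop described above (allocate objects to clusters, update prototypes, update the adaptive distance) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: every variable has already been homogenized into a histogram with the same number of bins, prototypes are taken as cluster means, and a single weight per variable is shared by all clusters, with the weights constrained to have product one. The function name `adaptive_dynamic_clustering` and all of its parameters are hypothetical.

```python
import numpy as np

def adaptive_dynamic_clustering(x, n_clusters, n_iter=50, seed=0, eps=1e-12):
    """Sketch of dynamic clustering with one adaptive weight per variable.

    x: array of shape (n_objects, n_variables, n_bins) holding histogram
       weights (assumed layout: same number of bins for every variable).
    Returns (labels, prototypes, weights).
    """
    rng = np.random.default_rng(seed)
    n, p, _ = x.shape
    # Initial prototypes: n_clusters distinct objects drawn at random.
    prototypes = x[rng.choice(n, size=n_clusters, replace=False)].copy()
    weights = np.ones(p)          # start from the non-adaptive distance
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Allocation: per-variable squared Euclidean distances, shape (n, K, p),
        # combined with the current weights and minimized over clusters.
        d = ((x[:, None, :, :] - prototypes[None, :, :, :]) ** 2).sum(axis=3)
        new_labels = (d * weights).sum(axis=2).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                 # partition is stable
        labels = new_labels

        # Representation: each prototype becomes the mean of its members;
        # an empty cluster keeps its previous prototype.
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members) > 0:
                prototypes[k] = members.mean(axis=0)

        # Adaptive distance: weight each variable inversely to its
        # within-cluster dispersion, normalized so the weights multiply to 1.
        delta = ((x - prototypes[labels]) ** 2).sum(axis=(0, 2)) + eps
        weights = np.exp(np.log(delta).mean()) / delta

    return labels, prototypes, weights
```

A cluster-specific variant would compute `delta` separately for each cluster, giving a weight matrix of shape `(n_clusters, n_variables)` in place of the single weight vector used here.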
