Unsupervised Classification of Multidimensional Data Using Marked Point Processes

This article describes a new unsupervised algorithm for classifying multidimensional data. It first detects the prototypes of the classes present in a sample, then applies the KNN algorithm to classify all observations. Prototype detection is based on marked point processes and combines two components: an adaptation of the Metropolis-Hastings-Green method, which generates moves that manipulate the objects of the process (birth, death, …), and a Gibbs model, which introduces the potential function that expresses the interactions of the process in terms of energy. Several experiments were carried out on multidimensional point data in which the classes are not linearly separable, and on real DNA microarray data. A comparison with existing classification methods demonstrated the effectiveness of this new algorithm.
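
To make the two stages concrete, the following is a minimal sketch, not the authors' model. It assumes 2-D data rescaled to the unit square, and a hypothetical Gibbs energy made of a data term (a prototype is rewarded for each observation within `radius`) and a pairwise repulsion term (`gamma` per pair of overlapping prototypes). The names `energy`, `birth_death_sampler`, `knn_assign`, and the parameters `radius`, `gamma`, and `cost` are illustrative assumptions. Only birth and death moves are shown, with the standard acceptance ratios for a unit-rate Poisson reference process on a window of area 1.

```python
import math
import random
from collections import Counter

def energy(protos, data, radius=0.15, gamma=5.0, cost=3.0):
    """Hypothetical Gibbs energy of a prototype configuration (lower is better)."""
    u = 0.0
    for p in protos:
        # Data term: each prototype pays `cost` and earns 1 per nearby observation.
        support = sum(1 for x in data if math.dist(p, x) < radius)
        u += cost - support
    # Interaction term: penalise pairs of prototypes closer than 2 * radius.
    for i in range(len(protos)):
        for j in range(i + 1, len(protos)):
            if math.dist(protos[i], protos[j]) < 2 * radius:
                u += gamma
    return u

def birth_death_sampler(data, n_iter=2000, seed=0):
    """Birth-death Metropolis-Hastings on the unit square (window area 1)."""
    rng = random.Random(seed)
    protos = []
    u = energy(protos, data)
    for _ in range(n_iter):
        if rng.random() < 0.5:
            # Birth: add a point drawn uniformly in the window.
            cand = protos + [(rng.random(), rng.random())]
            u_new = energy(cand, data)
            # Exponent is capped to avoid overflow; the move is accepted anyway.
            ratio = math.exp(min(50.0, u - u_new)) / len(cand)
        elif protos:
            # Death: remove one existing point chosen uniformly.
            k = rng.randrange(len(protos))
            cand = protos[:k] + protos[k + 1:]
            u_new = energy(cand, data)
            ratio = math.exp(min(50.0, u - u_new)) * len(protos)
        else:
            continue  # Death proposed on an empty configuration: no change.
        if rng.random() < min(1.0, ratio):
            protos, u = cand, u_new
    return protos

def knn_assign(data, protos, proto_labels, k=1):
    """Label each observation by majority vote among its k nearest prototypes."""
    labels = []
    for x in data:
        order = sorted(range(len(protos)), key=lambda i: math.dist(x, protos[i]))
        votes = Counter(proto_labels[i] for i in order[:k])
        labels.append(votes.most_common(1)[0][0])
    return labels
```

In this sketch each retained prototype could be given its own class label before calling `knn_assign`; how the article groups prototypes into classes and which marks it attaches to the points are not specified in the abstract, so that step is left to the caller.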
