Clustering of Gaussian distributions

Clustering plays a basic role in many areas of data engineering, pattern recognition and image analysis. The Gaussian Mixture Model (GMM) and Cross-Entropy Clustering (CEC) can approximate data of varied shapes by covering it with several clusters, e.g. elliptical ones. However, we often need to extract clusters concentrated around lower-dimensional non-linear manifolds. Moreover, it is problematic to extract a cluster when the data contains a large number of components. Here, we propose a method of solving the above problems by clustering density distributions. This approach makes it possible to determine components of various sizes and geometries. Moreover, in clustering problems the data is often naturally given as Gaussian distributions.
