Density and entanglement-based clustering of sequence data

Clustering is one of the crucial tasks in data processing, and various techniques have been developed for specific purposes. Sequence data consist of a sequence of data elements, each of which has its own predecessor and successor. This paper proposes new clustering methods for sequence data that use the notions of data density and entanglement. The proposed methods weight the data points by either their density or their entanglement, and these weights in turn affect the location of the cluster centroids. The algorithms are extensions of the k-means and fuzzy k-means clustering algorithms. Some experimental results are presented to show the behavioral characteristics of the proposed algorithms.
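
The idea that per-point weights shift the cluster centroids can be illustrated with a minimal sketch of a weighted k-means update. This is not the paper's algorithm: the abstract does not define density or entanglement, so the neighbour-count density in `density_weights`, the `radius` parameter, and the function names are all assumptions made for illustration, and the sketch ignores the predecessor/successor structure of sequence data.

```python
import math


def density_weights(points, radius):
    """Hypothetical density: number of points within `radius` of each point
    (the point itself included). The paper's definition may differ."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for each point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def weighted_kmeans(points, weights, centroids, iterations=20):
    """k-means in which each centroid is the weighted mean of its members."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == j]
            total = sum(w for _, w in members)
            if total == 0:
                # Empty (or zero-weight) cluster: keep the previous centroid.
                updated.append(old)
                continue
            updated.append(
                tuple(sum(w * p[d] for p, w in members) / total
                      for d in range(len(old)))
            )
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (10.0,)]
    # A heavier weight on the first point pulls its centroid toward it.
    print(weighted_kmeans(pts, [3, 1, 1], [(0.0,), (10.0,)]))
    print(weighted_kmeans(pts, [1, 1, 1], [(0.0,), (10.0,)]))
```

Any weight vector can be passed to `weighted_kmeans`, so an entanglement-based weighting could be substituted for `density_weights` once a concrete definition is chosen.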
