Reducing UK-Means to K-Means

This paper proposes an optimisation to the UK-means algorithm, which generalises the k-means algorithm to handle objects whose locations are uncertain. The location of each object is described by a probability density function (pdf). The UK-means algorithm needs to compute expected distances (EDs) between each object and the cluster representatives. Evaluating an ED from first principles is a very costly operation, because the pdfs are different and arbitrary, yet UK-means needs to evaluate many EDs. This is a major performance burden of the algorithm. In this paper, we derive a formula for evaluating EDs efficiently. This tremendously reduces the execution time of UK-means, as demonstrated by our preliminary experiments. We also illustrate that this optimised formula effectively reduces the UK-means problem to the traditional clustering problem addressed by the k-means algorithm.
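The abstract does not state the formula itself, so the following is only a minimal sketch of one way such a shortcut can work. It assumes the distance is the squared Euclidean distance and that each pdf is approximated by equally weighted sample points. Under those assumptions, the identity E||X - c||^2 = E||X - mu||^2 + ||mu - c||^2 holds, where mu is the centroid of the object's pdf. The names `UncertainObject`, `spread`, and `expected_sq_dist_brute` are hypothetical and chosen for illustration only.

```python
import random


def mean_point(samples):
    """Centroid (mean vector) of equally weighted sample points."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(p[d] for p in samples) / n for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expected_sq_dist_brute(samples, c):
    """ED from first principles: average squared distance over all samples."""
    return sum(sq_dist(p, c) for p in samples) / len(samples)


class UncertainObject:
    """Hypothetical wrapper that precomputes per-object quantities once."""

    def __init__(self, samples):
        self.centroid = mean_point(samples)
        # Expected squared distance from the object to its own centroid.
        self.spread = expected_sq_dist_brute(samples, self.centroid)

    def expected_sq_dist(self, c):
        # E||X - c||^2 = E||X - mu||^2 + ||mu - c||^2 (assumed squared Euclidean)
        return self.spread + sq_dist(self.centroid, c)


# Assumed toy data: a 2-D object whose pdf is approximated by 1000 samples.
rng = random.Random(0)
samples = [(rng.gauss(2.0, 0.5), rng.gauss(-1.0, 1.5)) for _ in range(1000)]
obj = UncertainObject(samples)

representatives = [(0.0, 3.0), (2.5, -0.5), (-4.0, 1.0)]
brute = [expected_sq_dist_brute(samples, c) for c in representatives]
fast = [obj.expected_sq_dist(c) for c in representatives]

# Nearest representative by ED versus by plain centroid distance.
nearest_by_ed = min(range(len(representatives)), key=lambda i: brute[i])
nearest_by_centroid = min(
    range(len(representatives)),
    key=lambda i: sq_dist(obj.centroid, representatives[i]),
)
```

In this sketch the `spread` term does not depend on the cluster representative, so the nearest representative by ED is the same as the nearest one by centroid distance. This is consistent with the reduction to k-means described above: under the stated assumptions, each object's assignment can be decided from its centroid alone.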
