A Novel Distance Measure for Interval Data

Interval data is attracting attention from the data analysis community due to its ability to describe complex concepts. Since clustering is an important data analysis tool, extending clustering techniques to interval data is important. Applying traditional clustering methods to interval data loses information inherent in this data type. This paper proposes a novel dissimilarity measure that explores the internal structure of intervals in a probabilistic manner based on domain knowledge. Our experiments show that interval clustering based on the proposed dissimilarity measure produces meaningful results.
