Improved Multi-Threshold BIRCH Clustering Algorithm

BIRCH is a clustering algorithm suited to very large data sets. The algorithm builds a CF-tree in which every entry in each leaf node must satisfy a uniform threshold T, and the CF-tree is rebuilt at each stage with a different threshold. Using a single threshold, however, causes many shortcomings in BIRCH. This paper proposes a solution to these shortcomings that uses multiple thresholds instead of a single threshold.
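The threshold test on a leaf entry can be illustrated with a minimal sketch. It assumes the standard BIRCH clustering feature (N, LS, SS) and the radius criterion from the original BIRCH formulation. The class name `CFEntry` and its methods are hypothetical, and passing the threshold per call is only one way a multi-threshold variant might be wired; the abstract does not say how the paper assigns its thresholds.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CFEntry:
    """Hypothetical clustering feature (N, LS, SS) summarizing one subcluster."""
    n: int = 0                               # number of points absorbed
    ls: list = field(default_factory=list)   # linear sum per dimension
    ss: float = 0.0                          # sum of squared norms

    def _merged(self, point):
        # Return (N, LS, SS) as they would be after adding `point`.
        ls = [a + b for a, b in zip(self.ls, point)] if self.n else list(point)
        ss = self.ss + sum(x * x for x in point)
        return self.n + 1, ls, ss

    def radius_if_added(self, point):
        # R^2 = SS/N - ||LS/N||^2; clamp at 0 to guard against rounding error.
        n, ls, ss = self._merged(point)
        r2 = ss / n - sum((v / n) ** 2 for v in ls)
        return math.sqrt(max(r2, 0.0))

    def try_absorb(self, point, threshold):
        # Absorb the point only if the resulting radius stays within the threshold.
        # Assumption: a multi-threshold variant could pass a different
        # `threshold` for each entry instead of one global T.
        if self.radius_if_added(point) > threshold:
            return False
        self.n, self.ls, self.ss = self._merged(point)
        return True


entry = CFEntry()
entry.try_absorb((0.0, 0.0), threshold=0.5)   # first point, radius 0
entry.try_absorb((1.0, 0.0), threshold=0.5)   # radius exactly 0.5, accepted
far_ok = entry.try_absorb((10.0, 0.0), threshold=0.5)  # radius far above 0.5
```

With a single tight threshold the distant point is rejected and would start a new entry. The same call with a looser threshold would absorb it, which is the kind of behavior a per-entry threshold is meant to control.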
