DHC: A Distributed Hierarchical Clustering Algorithm for Large Datasets

Hierarchical clustering is a classical method that provides a hierarchical representation of data for analysis. However, in practical applications, it is difficult to deal with massive dat...
