Density‐based clustering

Clustering refers to the task of identifying groups or clusters in a data set. In density‐based clustering, a cluster is a set of data objects spread in the data space over a contiguous region of high density of objects. Density‐based clusters are separated from each other by contiguous regions of low density of objects. Data objects located in low‐density regions are typically considered noise or outliers.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 231–240 DOI: 10.1002/widm.30
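
The definition in the abstract can be made concrete with a small sketch. The code below is not taken from the article; it is a minimal DBSCAN-style illustration under stated assumptions: density is approximated by counting neighbors within a radius `eps`, a point is treated as lying in a high-density region when that count reaches `min_pts`, and the function name `density_cluster` and the toy data are hypothetical. Points that no dense region reaches are labeled as noise.

```python
from math import dist

NOISE = -1  # label for objects in low-density regions


def density_cluster(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE.

    Assumption: a point is 'dense' (a core point) if at least min_pts
    points, itself included, lie within distance eps of it.
    """
    n = len(points)
    # Brute-force eps-neighborhoods (quadratic; fine for a toy example).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [NOISE] * n
    cluster_id = 0
    for i in range(n):
        # Only an unlabeled core point can start a new cluster.
        if not is_core[i] or labels[i] != NOISE:
            continue
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster_id
                    # Expand only through core points, so the cluster
                    # stays within a contiguous high-density region.
                    if is_core[q]:
                        frontier.append(q)
        cluster_id += 1
    return labels


# Two dense groups of four points each, plus one isolated point.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 5.0),
]
labels = density_cluster(points, eps=1.5, min_pts=3)
print(labels)
```

In this toy data set the two groups are separated by a region with no neighbors within `eps`, so they end up with different labels, while the isolated point at (5, 5) keeps the noise label. The quadratic neighborhood search is chosen for brevity; an implementation for larger data would normally use a spatial index.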
