Paper Information - Optimization and Simplification of Hierarchical Clusterings

Optimization and Simplification of Hierarchical Clusterings

Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and in the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a system inexpensively constructs 'tentative' clusterings for initial examination, followed by iterative optimization, which continues to search in the background for improved clusterings. This paper evaluates hierarchical redistribution, which appears to be a novel optimization strategy in the clustering literature. A final component of search prunes tree-structured clusterings, thus simplifying them for analysis. In particular, resampling is used to significantly simplify hierarchical clusterings.

Douglas Fisher
