Paper information - SCALABILITY ANALYSIS THROUGH AGGLOMERATIVE (HIERARCHICAL) CLUSTERING

SCALABILITY ANALYSIS THROUGH AGGLOMERATIVE (HIERARCHICAL) CLUSTERING – COMPLETE AND CENTROID ALGORITHM

Clustering is a well-known problem in statistics and engineering: how to arrange a set of vectors (measurements) into a number of groups (clusters). It is an important area of application in a variety of fields, including data mining, statistical data analysis and vector quantization. The problem has been formulated in various ways in the machine learning, pattern recognition, optimization and statistics literature. The fundamental clustering problem is that of grouping together (clustering) data items that are similar to each other. The most general approach to clustering is to view it as a density estimation problem. Classification algorithms, by contrast, rely on human supervision to train them to assign data to pre-defined categorical classes. The term “classification” is frequently used loosely as a label for the algorithms behind all data mining tasks. It is better reserved for the category of supervised learning algorithms used to search for interesting data patterns. Although classification algorithms have become popular and ubiquitous in data mining (DM) research, they are just one of the many types of algorithms available for solving specific types of DM tasks.
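The grouping of similar items described above can be illustrated with a minimal sketch of agglomerative clustering under the two linkage rules named in the title, complete and centroid. This is not the authors' implementation: the function names (`complete_linkage`, `centroid_linkage`, `agglomerate`), the Euclidean distance, the stopping rule (a target number of clusters `k`) and the sample points are all assumptions chosen for illustration, and the naive pair search is not meant to reflect the scalability behaviour the paper analyses.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(a, b):
    """Cluster distance = largest distance between any two members."""
    return max(dist(p, q) for p in a for q in b)


def centroid(cluster):
    """Component-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def centroid_linkage(a, b):
    """Cluster distance = distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))


def agglomerate(points, k, linkage):
    """Merge the closest pair of clusters until only k clusters remain.

    Hypothetical helper: starts with one cluster per point and rescans
    every pair of clusters at each step (simple, but slow on large inputs).
    """
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        # combinations() yields i < j, so deleting index j keeps i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up sample data: two well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
result_complete = agglomerate(points, 2, complete_linkage)
result_centroid = agglomerate(points, 2, centroid_linkage)
```

On well-separated data such as this, both linkage rules recover the same two groups; they tend to differ on elongated or overlapping clusters, where complete linkage favours compact groups.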

N. Revathy | Ms. A. Uma Maheswari
