Cluster Analysis: Modeling Groups in Text

This chapter explains the unsupervised learning method of grouping data known as cluster analysis. The chapter shows how hierarchical and k-means clustering can place text or documents into significant groups to increase the understanding of the data. Clustering is a valuable tool that helps us find naturally occurring similarities.

[1]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[2]  J. H. Ward Hierarchical Grouping to Optimize an Objective Function , 1963 .

[3]  Michael J. A. Berry,et al.  Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management , 2004 .

[4]  Douglas Steinley,et al.  K-means clustering: a half-century synthesis. , 2006, The British journal of mathematical and statistical psychology.

[5]  George Karypis,et al.  Hierarchical Clustering Algorithms for Document Datasets , 2005, Data Mining and Knowledge Discovery.

[6]  Charu C. Aggarwal,et al.  Mining Text Data , 2012 .

[7]  Mohammed J. Zaki Data Mining and Analysis: Fundamental Concepts and Algorithms , 2014 .

[8]  Anil K. Jain,et al.  Statistical Pattern Recognition: A Review , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  Fionn Murtagh,et al.  A Survey of Recent Advances in Hierarchical Clustering Algorithms , 1983, Comput. J..

[10]  George Nagy,et al.  State of the art in pattern recognition , 1968 .

[11]  Michalis Vazirgiannis,et al.  Cluster validity methods: part I , 2002, SGMD.

[12]  Ellen M. Voorhees,et al.  Implementing agglomerative hierarchic clustering algorithms for use in document retrieval , 1986, Inf. Process. Manag..

[13]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[14]  Peter J. Rousseeuw,et al.  Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .

[15]  Rui Xu,et al.  Survey of clustering algorithms , 2005, IEEE Transactions on Neural Networks.