An Efficiency K-Means Data Clustering in Cotton Textile Imports

Data clustering is a technique for finding similar characteristics that are hidden in data sets and dividing the data into groups accordingly. A major factor influencing cluster validation is the choice of the optimal number of clusters. This paper introduces a novel random algorithm for estimating the optimal number of clusters. An efficient hybrid method, combining this random algorithm for choosing a good k with a modified classical k-means clustering method, is described and applied to the clustering and ranking of countries by cotton textile imports. The method is implemented on a real-world data set of U.S. cotton textile and apparel imports.
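
The abstract does not specify how the random algorithm for k works, so the following is only a minimal sketch of the general idea of pairing a k-selection step with k-means. It assumes a stand-in technique: plain Lloyd's k-means with random initial centers, restarted several times for each candidate k, with the mean silhouette score used to pick k. The function names (`kmeans`, `silhouette`, `choose_k`) and the toy two-dimensional points are hypothetical and are not taken from the paper or from the import data.

```python
import math
import random


def kmeans(points, k, rng, iters=100):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


def silhouette(points, labels):
    """Mean silhouette score; higher means tighter, better separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a score of 0
        # Mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != l
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def choose_k(points, k_max, restarts, rng):
    """Try k = 2..k_max with random restarts; keep the best-scoring run."""
    best = (-1.0, None, None)
    for k in range(2, k_max + 1):
        for _ in range(restarts):
            labels, _centers = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if score > best[0]:
                best = (score, k, labels)
    return best


# Hypothetical toy data: three well-separated groups of ten points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in blob_centers
    for _ in range(10)
]

best_score, best_k, best_labels = choose_k(points, k_max=5, restarts=30, rng=rng)
print(best_k, round(best_score, 3))
```

For country-level import data, each point would be a vector of features per country (for example, import volumes by category), and feature scaling would likely matter before distances are computed. Those choices are outside what the abstract states.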
