Research on a semi-supervised K-means clustering algorithm in data mining

K-means clustering has become an important tool for analyzing gene expression data: it can search along both the gene and condition dimensions for clusters whose expression levels fluctuate together. However, K-means is a multi-objective local search algorithm and easily falls into a local optimum when applied to complex gene data. To improve the algorithm's global search capability, this paper presents a semi-supervised K-means clustering algorithm. First, the K-means clustering algorithm is used to process the gene data. Then, the improved semi-supervised K-means clustering performs greedy iterations to refine the K-means clusters and achieve better results. Simulation experiments show that the global semi-supervised K-means clustering algorithm has better optimization ability and a better clustering effect than the MDO algorithm.
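The two-stage procedure described above can be illustrated with a minimal sketch. The abstract does not say how the supervision enters the algorithm, so this sketch assumes a seeded, constrained variant: a few points carry known cluster labels, those labels seed the centroids of the second stage, and the labeled points are never reassigned. The function names (`kmeans`, `semi_supervised_kmeans`), the `known_labels` mapping, and the greedy rule for filling unlabeled clusters are all hypothetical and are not taken from the paper.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, centroids, fixed=None, max_iter=100):
    """Lloyd iterations. `fixed` maps point index -> label that must not change."""
    fixed = fixed or {}
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid, then re-impose the known labels.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        for i, lab in fixed.items():
            new_labels[i] = lab
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: an empty cluster keeps its previous centroid.
        centroids = [
            mean([p for p, l in zip(points, labels) if l == j])
            if j in labels else centroids[j]
            for j in range(k)
        ]
    return labels, centroids

def semi_supervised_kmeans(points, k, known_labels, seed=0):
    """Assumed two-stage scheme: plain K-means, then a seeded, constrained pass."""
    rng = random.Random(seed)
    # Stage 1: unsupervised K-means from randomly chosen initial centroids.
    _, centroids = kmeans(points, rng.sample(points, k))
    # Stage 2: clusters with labeled points start at the mean of those points.
    groups = {}
    for i, lab in known_labels.items():
        groups.setdefault(lab, []).append(points[i])
    seeded = {lab: mean(members) for lab, members in groups.items()}
    # Clusters without labels greedily take the stage-1 centroid that is
    # farthest from the seeds chosen so far (an assumption of this sketch).
    spare = list(centroids)
    for j in range(k):
        if j not in seeded:
            best = max(spare, key=lambda c: min(
                (sq_dist(c, s) for s in seeded.values()), default=0.0))
            spare.remove(best)
            seeded[j] = best
    start = [seeded[j] for j in range(k)]
    return kmeans(points, start, fixed=known_labels)
```

Calling `semi_supervised_kmeans(points, k, {0: 1, 3: 0})` would pin point 0 to cluster 1 and point 3 to cluster 0 while the remaining points are assigned by distance. Seeding from labels is one common way to make K-means less sensitive to a poor random start, which is the local-optimum problem the abstract raises.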
