The WEKA data mining platform and secondary development on WEKA

This paper runs data mining tests on WEKA, an open-source data mining tool, analyzes the results, and identifies problems in the WEKA system. To overcome the weakness of clustering in WEKA, the paper carries out secondary development on the WEKA platform to extend its clustering algorithms. It describes the process of embedding the k-medoids substitution method into WEKA, making full use of the classes and visualization functions of the open-source WEKA code. The embedded algorithm is then compared with the original algorithm. The k-medoids substitution method improves on the accuracy of the traditional k-medoids method and prevents it from becoming trapped in a local optimum. Moreover, the method is insensitive to the choice of initial points and obtains better clustering results.
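The abstract does not spell out the substitution step, so the following is only an illustrative sketch. It assumes that "substitution" means a swap-based search in the style of PAM: each medoid is tentatively replaced by each non-medoid point, and the replacement is kept whenever it lowers the total distance of all points to their nearest medoid. The function names (`total_cost`, `k_medoids_swap`) and the `seed` parameter are hypothetical, and the code is plain Python, not the WEKA implementation described in the paper.

```python
import math
import random


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def k_medoids_swap(points, k, dist=math.dist, seed=0):
    """Swap-based k-medoids (an assumed reading of the 'substitution method').

    Returns (medoid indices, cluster label per point, final total cost).
    """
    rng = random.Random(seed)
    # Random initial medoids, stored as indices into `points`.
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids, dist)

    improved = True
    while improved:
        improved = False
        for i in range(k):
            for h in range(len(points)):
                if h in medoids:
                    continue
                # Tentatively substitute medoid i with non-medoid h.
                candidate = medoids[:i] + [h] + medoids[i + 1:]
                cost = total_cost(points, candidate, dist)
                if cost < best:  # keep only strictly improving swaps
                    medoids, best = candidate, cost
                    improved = True

    # Assign each point to the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels, best
```

Because every accepted swap strictly lowers the total cost, the loop terminates. On small, well-separated data the search tends to reach the same medoids from different starting points, which is the kind of insensitivity to initial points the abstract claims. In WEKA itself, logic like this would sit behind the `weka.clusterers.Clusterer` interface (`buildClusterer`, `clusterInstance`, `numberOfClusters`); how the paper wires it in is not stated here.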