Performance Evaluation of Data Mining Techniques

Data mining has gained immense popularity in fields such as medicine, education and industry. It is the process of extracting useful information from large datasets and using it to predict outcomes. In this paper, we survey various data mining techniques. We then evaluate the performance of eight of them, namely decision tree, random forest, naive Bayes, AdaBoost, multilayer perceptron neural network, radial basis function network, sequential minimal optimization and decision stump, on the UCI Communities and Crime dataset for classifying crime in US states. On the basis of the results obtained, we found that the decision tree outperforms the other techniques, with 96.4% accuracy and the lowest false-positive rate.
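The two measures reported above, accuracy and false-positive rate, can be illustrated with a minimal sketch. It assumes a binary labelling (1 for the positive class, 0 otherwise) and uses made-up labels; the function name `accuracy_and_fpr` and the sample data are hypothetical and are not taken from the paper's experiments.

```python
def accuracy_and_fpr(y_true, y_pred, positive=1):
    """Return (accuracy, false-positive rate) for binary labels.

    Assumption: every label not equal to `positive` is treated as negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # correctly predicted positive
            else:
                fp += 1  # negative instance predicted as positive
        else:
            if truth == positive:
                fn += 1  # positive instance predicted as negative
            else:
                tn += 1  # correctly predicted negative

    total = tp + tn + fp + fn
    negatives = fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / negatives if negatives else 0.0
    return accuracy, fpr


# Illustrative labels only (not results from the paper).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.3f}, false-positive rate={fpr:.3f}")
```

Under this definition, a classifier is preferred when its accuracy is high and its false-positive rate is low, which is the sense in which the techniques are compared above.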
