Using discretization to improve E-commerce anomaly detection process

Effective data mining solutions have long been sought for Electronic Commerce (E-Commerce) transaction anomaly detection models, which must accurately identify anomalous transaction records. However, many E-Commerce transaction anomaly detection models perform sub-optimally because the data sets are highly imbalanced. This paper proposes a preprocessing method based on discretization of continuous variables to address the problem of highly imbalanced data. The Logistic Regression, Naive Bayes, RBFNetwork, and NBTree classifiers are applied to evaluate the discretization method. Results indicate that the discretization method can achieve excellent performance.
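As an illustration of what discretizing a continuous variable before classification can look like, the following is a minimal sketch using equal-frequency binning. The abstract does not state which discretization scheme the paper uses, so the binning rule, the function names (`equal_frequency_cut_points`, `discretize`), and the sample transaction amounts are all assumptions made for this example, not the authors' method.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, n_bins):
    """Return up to n_bins - 1 cut points that split the values into
    bins holding roughly the same number of records (assumed scheme)."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        cut = ordered[(k * n) // n_bins]
        # Skip repeated cut points so bin boundaries stay strictly increasing.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def discretize(values, cuts):
    """Map each continuous value to the index of the bin it falls in."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical transaction amounts; the two large values mimic rare anomalies.
amounts = [5.0, 7.5, 9.0, 12.0, 15.0, 18.0, 250.0, 900.0]
cuts = equal_frequency_cut_points(amounts, n_bins=4)
bins = discretize(amounts, cuts)
print(cuts)
print(bins)
```

The resulting bin indices can then be passed to a classifier as categorical features in place of the raw amounts. Equal-frequency binning is chosen here only because it keeps extreme values from stretching the bin widths; a supervised scheme could be substituted without changing the surrounding code.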
