Empirical Analysis of Bagged SVM Classifier for Data Mining Applications

Data mining is the use of algorithms to extract the information and patterns produced by the knowledge discovery in databases (KDD) process. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before the data are examined. The feasibility and benefits of the proposed approach are demonstrated on data mining applications such as intrusion detection, direct marketing, and signature verification. A variety of techniques have been employed for analysis, ranging from traditional statistical methods to data mining approaches. Bagging and boosting are two relatively new but popular methods for producing ensembles. In this work, bagging is evaluated on real and benchmark data sets for intrusion detection, direct marketing, and signature verification, with an SVM as the base learner. The proposed bagged SVM is superior to the individual SVM classifier for these data mining applications in terms of classification accuracy.
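As an illustration of the bagging idea described above, the following is a minimal sketch rather than the authors' implementation. It assumes a linear SVM trained by Pegasos-style stochastic subgradient descent without a bias term, a toy two-class data set, and majority voting; the function names (`train_linear_svm`, `bagged_svm`, `bagged_predict`) and all parameter values are hypothetical and chosen only for this example. The abstract does not specify the kernel, the training procedure, or the ensemble size.

```python
import random
from collections import Counter


def train_linear_svm(X, y, lam=0.01, steps=500, rng=None):
    """Train a linear SVM (no bias term) on the hinge loss with
    Pegasos-style stochastic subgradient descent. Labels are +1 / -1."""
    if rng is None:
        rng = random.Random(0)
    w = [0.0] * len(X[0])
    for t in range(1, steps + 1):
        i = rng.randrange(len(X))
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        eta = 1.0 / (lam * t)      # decreasing step size
        scale = 1.0 - 1.0 / t      # equals 1 - eta * lam (regularisation shrink)
        w = [scale * wj for wj in w]
        if margin < 1.0:           # sample violates the margin: hinge subgradient step
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict_one(w, x):
    """Classify one sample by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def bagged_svm(X, y, n_models=11, seed=0):
    """Train n_models SVMs, each on a bootstrap sample
    (drawn with replacement, same size as the training set)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        models.append(train_linear_svm(Xb, yb, rng=rng))
    return models


def bagged_predict(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_one(w, x) for w in models)
    return votes.most_common(1)[0][0]


# Toy, linearly separable data: one class in the positive quadrant,
# the other mirrored through the origin (assumption for illustration).
pos = [(1.5 + 0.25 * a, 1.5 + 0.25 * b) for a in range(5) for b in range(5)]
X = [list(p) for p in pos] + [[-p[0], -p[1]] for p in pos]
y = [1] * len(pos) + [-1] * len(pos)

models = bagged_svm(X, y)
accuracy = sum(bagged_predict(models, x) == t for x, t in zip(X, y)) / len(X)
print("training accuracy:", accuracy)
```

An odd number of base learners is used so that a two-class majority vote cannot tie. On real data, accuracy would be estimated on held-out samples or by cross-validation rather than on the training set as in this toy run.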
