An enhanced Support Vector Machine classification framework by using Euclidean distance function for text document categorization

This paper presents a new text document classification framework, coined Euclidean-SVM, that uses the Support Vector Machine (SVM) approach in the training phase and the Euclidean distance function in the classification phase. The SVM constructs a classifier by generating a decision surface, namely the optimal separating hyper-plane, that partitions different categories of data points in the vector space. The concept of the optimal separating hyper-plane can be generalized to non-linearly separable cases by introducing kernel functions that map the data points from the input space into a high-dimensional feature space, where they can be separated by a linear hyper-plane. As a result, the choice of kernel function has a high impact on the classification accuracy of the SVM. Besides the kernel function, the value of the soft margin parameter C is another critical factor in the performance of the SVM classifier. Hence, one of the critical problems of the conventional SVM classification framework is the need to determine the appropriate kernel function and the appropriate value of C for datasets of varying characteristics in order to guarantee high classifier accuracy. In this paper, we introduce a distance measurement technique that uses the Euclidean distance function to replace the optimal separating hyper-plane as the classification decision function in the SVM. In our approach, the support vectors for each category are identified from the training data points during the training phase using the SVM. In the classification phase, when a new data point is mapped into the original vector space, the average distances between the new data point and the support vectors of each category are measured using the Euclidean distance function. The classification decision is made based on the category of support vectors that has the lowest average distance to the new data point, which makes the decision independent of the efficacy of the hyper-plane formed by a particular kernel function and soft margin parameter. We tested the proposed framework on several text datasets. The experimental results show that, with this approach, the accuracy of the Euclidean-SVM text classifier is only weakly affected by the choice of kernel function and soft margin parameter C.
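
The classification phase described above can be illustrated with a minimal sketch. It assumes the support vectors for each category have already been identified by SVM training and are available as plain coordinate tuples in the original vector space. The function names, category labels, and toy vectors below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distance(point: Vector, support_vectors: List[Vector]) -> float:
    """Mean Euclidean distance from a point to one category's support vectors."""
    total = sum(euclidean_distance(point, sv) for sv in support_vectors)
    return total / len(support_vectors)


def classify(point: Vector, sv_by_category: Dict[str, List[Vector]]) -> str:
    """Return the category whose support vectors are closest on average."""
    return min(
        sv_by_category,
        key=lambda category: average_distance(point, sv_by_category[category]),
    )


# Hypothetical support vectors, assumed to come from a prior SVM training run.
sv_by_category = {
    "sports": [(0.0, 0.0), (0.0, 2.0)],
    "finance": [(4.0, 0.0), (4.0, 2.0)],
}

predicted = classify((1.0, 1.0), sv_by_category)
print(predicted)
```

In this sketch the kernel function and C would influence only which training points are retained as support vectors; the decision itself uses plain distances in the input space rather than the hyper-plane.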
