Feature weighted confidence to incorporate prior knowledge into support vector machines for classification

This paper proposes an approach called feature weighted confidence with support vector machine (FWC–SVM), which incorporates prior knowledge into an SVM that uses sample confidence. First, we use prior features to express prior knowledge. Second, FWC–SVM is biased to assign larger weights in the slope vector $\omega$ to prior features than to non-prior features. Third, FWC–SVM employs an adaptive paradigm to update sample confidence and feature weights iteratively. We conduct extensive experiments to compare FWC–SVM with state-of-the-art methods, including standard SVM, WSVM, and WMSVM, on an English dataset (the Reuters-21578 text collection) and a Chinese dataset (the TanCorpV1.0 text collection). Experimental results demonstrate that, on non-noisy data, FWC–SVM outperforms the other methods when the retaining level is not larger than 0.8. On noisy data, FWC–SVM can produce better performance than WSVM on the Reuters-21578 dataset when the retaining level is larger than 0.4 and on the TanCorpV1.0 dataset when the retaining level is larger than 0.5. We also discuss the strengths and weaknesses of the proposed FWC–SVM approach.
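The three steps above can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes a linear SVM trained by subgradient descent, a smaller L2 penalty on prior features so that their entries in the slope vector can grow larger, and a logistic function of the signed margin as the confidence update. The names `train_weighted_svm`, `fit_adaptive`, `penalty_scale`, and `prior_penalty` are hypothetical, and this sketch keeps the per-feature penalties fixed while re-fitting the slope vector each round.

```python
import math

def decision(w, b, x):
    """Linear decision value w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_weighted_svm(X, y, confidence, penalty_scale, C=1.0, epochs=200, lr=0.01):
    """Subgradient descent on the (assumed) objective
         0.5 * sum_j penalty_scale[j] * w[j]**2
         + C * sum_i confidence[i] * max(0, 1 - y[i] * (w . x[i] + b)).
    A smaller penalty_scale[j] lets feature j take a larger weight."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        # Gradient of the per-feature regularizer.
        grad_w = [penalty_scale[j] * w[j] for j in range(n_features)]
        grad_b = 0.0
        for xi, yi, ci in zip(X, y, confidence):
            # Only margin violators contribute to the hinge subgradient,
            # scaled by the confidence of the sample.
            if yi * decision(w, b, xi) < 1.0:
                for j in range(n_features):
                    grad_w[j] -= C * ci * yi * xi[j]
                grad_b -= C * ci * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def fit_adaptive(X, y, prior_features, rounds=3, prior_penalty=0.2):
    """Alternate between fitting the SVM and updating sample confidence.
    The logistic confidence update is an assumption made for illustration."""
    n_features = len(X[0])
    penalty_scale = [prior_penalty if j in prior_features else 1.0
                     for j in range(n_features)]
    confidence = [1.0] * len(X)
    for _ in range(rounds):
        w, b = train_weighted_svm(X, y, confidence, penalty_scale)
        # Samples far on the correct side of the boundary gain confidence;
        # misclassified samples lose it.
        confidence = [1.0 / (1.0 + math.exp(-yi * decision(w, b, xi)))
                      for xi, yi in zip(X, y)]
    return w, b, confidence

if __name__ == "__main__":
    # Two identical, equally informative features; only feature 0 is "prior".
    X = [[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]]
    y = [1, 1, -1, -1]
    w, b, conf = fit_adaptive(X, y, prior_features={0})
    print("slope vector:", w, "bias:", b)
    print("sample confidence:", conf)
```

In the toy run both columns carry the same signal, so any difference between the two learned weights comes only from the reduced penalty on the prior feature. That is the bias toward prior features that the second step describes.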
