Incremental one-class classifier based on convex–concave hull

Binary classification on data streams with concept drift, where only one class (the target class) is available for learning, has received comparatively little attention. Well-known methods such as support vector data description (SVDD) and convex hull classifiers try to find a closed boundary around the target class, but their high computational complexity makes them unsuitable for large data sets and online tasks. This paper presents a novel online one-class classifier for streaming data. To keep time complexity low, it proposes an incremental convex–concave hull classification method, called ICCHC, which significantly reduces computation time and expands the target class boundary. The method can also adapt to gradual concept drift. ICCHC is evaluated on seventeen real-world data sets using hold-out validation, and a noise analysis is carried out. Compared with state-of-the-art methods, the experimental results show that ICCHC is superior in terms of accuracy, precision, and recall.
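To make the idea of an incrementally maintained boundary around the target class concrete, the following is a minimal sketch in Python. It is not the ICCHC algorithm. It covers only a plain convex hull in two dimensions and omits both the concave refinement and any drift handling such as forgetting old points. The names `HullOneClass`, `partial_fit`, and `predict` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive when o->a->b turns left."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on a counter-clockwise convex polygon."""
    if len(hull) < 3:
        return p in hull  # degenerate hull: only exact matches count
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


class HullOneClass:
    """Illustrative one-class model (assumption: 2-D data, convex boundary only)."""

    def __init__(self) -> None:
        self.hull: List[Point] = []

    def partial_fit(self, p: Point) -> None:
        # Only target-class points that fall outside the hull change the model,
        # so the stored state is the hull vertices rather than the whole stream.
        if not inside(self.hull, p):
            self.hull = convex_hull(self.hull + [p])

    def predict(self, p: Point) -> bool:
        # True means "looks like the target class".
        return inside(self.hull, p)
```

Feeding target-class points one at a time through `partial_fit` keeps only the hull vertices in memory, which is the property that makes hull-based boundaries attractive for streams. A pure convex boundary can overestimate the target region when the class is non-convex, which is the kind of limitation a convex–concave formulation is meant to address.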
