Online Nonlinear Classification for High-Dimensional Data

We study the online binary classification problem under the empirical zero-one loss. We introduce a novel randomized classification algorithm based on highly dynamic hierarchical models that partition the feature space. Our approach jointly and sequentially learns the partitioning of the feature space, the optimal classifier among the doubly exponential number of classifiers defined by the tree, and the individual region classifiers, in order to directly minimize the cumulative loss. Although we adapt the entire hierarchical model to minimize a global loss function, the computational complexity of the introduced algorithm scales linearly with the dimensionality of the feature space and the depth of the tree. Furthermore, our algorithm can be applied to any streaming data without requiring a training phase or prior information. It therefore processes each data point on the fly and then discards it, which makes the algorithm particularly appealing for applications involving "big data". We evaluate the performance of the introduced algorithm on several real data sets.
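To make the flavor of this setting concrete, the following is a minimal sketch and not the algorithm introduced in the paper. It assumes a fixed, axis-aligned binary tree (the partition is not learned), a perceptron at every node, and a deterministic exponentially weighted vote over the nodes on the root-to-leaf path of each sample (the paper's method is randomized). The class name `PathEnsemble`, the splitting rule, and the parameter `eta` are all hypothetical choices made for illustration. The sketch does share one property with the abstract: each sample is seen once, and the per-sample cost is proportional to the tree depth times the feature dimension.

```python
import math
import random


class PathEnsemble:
    """Toy online classifier: one perceptron per node of a fixed binary tree.

    Illustrative assumption only; the partition here is fixed, not learned.
    """

    def __init__(self, dim, depth, eta=0.5):
        self.dim, self.depth, self.eta = dim, depth, eta
        n_nodes = 2 ** (depth + 1) - 1
        # One linear classifier (weights plus bias) per tree node.
        self.w = [[0.0] * (dim + 1) for _ in range(n_nodes)]
        # Log-domain trust in each node; lowered whenever the node errs.
        self.log_weight = [0.0] * n_nodes

    def _path(self, x):
        # Hypothetical split rule: at each level, threshold one coordinate at 0.
        node, path = 0, [0]
        for level in range(self.depth):
            go_right = x[level % self.dim] > 0.0
            node = 2 * node + 1 + int(go_right)
            path.append(node)
        return path

    def _votes(self, path, xb):
        return [1 if sum(wi * xi for wi, xi in zip(self.w[n], xb)) >= 0 else -1
                for n in path]

    def predict(self, x):
        path = self._path(x)
        xb = list(x) + [1.0]
        votes = self._votes(path, xb)
        top = max(self.log_weight[n] for n in path)
        score = sum(math.exp(self.log_weight[n] - top) * v
                    for n, v in zip(path, votes))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        """Predict first, then learn from the label y in {-1, +1}."""
        y_hat = self.predict(x)
        path = self._path(x)
        xb = list(x) + [1.0]
        for n, v in zip(path, self._votes(path, xb)):
            if v != y:
                self.log_weight[n] -= self.eta  # down-weight the erring node
                self.w[n] = [wi + y * xi for wi, xi in zip(self.w[n], xb)]  # perceptron step
        return y_hat


if __name__ == "__main__":
    rng = random.Random(0)
    model = PathEnsemble(dim=2, depth=2)
    errors = 0
    for _ in range(2000):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] * x[1] > 0 else -1  # XOR-like labels, not linearly separable
        errors += model.update(x, y) != y
    print("online mistakes:", errors)
```

On the XOR-like stream in the usage example, no single linear classifier separates the classes, but the classifiers attached to finer regions of the tree do, and the vote shifts toward them as the coarser nodes accumulate mistakes.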
