Non-smooth classification model based on new smoothing technique

This work describes a framework for solving the support vector machine with kernel (SVMK) problem. Recent work has shown that non-smooth loss functions can give more efficient results for supervised learning problems [1]. This motivates solving the SVMK problem with the hinge loss function. However, the hinge loss is non-differentiable, so standard gradient-based optimization methods cannot be applied directly to minimize the empirical risk. To overcome this difficulty, a special smoothing technique for the hinge loss is proposed. The resulting smooth problem, combined with Tikhonov regularization, is then solved using a stochastic gradient descent method. Finally, numerical experiments on academic and real-life datasets are presented to show the efficiency of the proposed approach.
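The pipeline outlined above (smooth the hinge loss, add a Tikhonov term, run stochastic gradient descent on a kernel expansion) can be illustrated with a minimal sketch. The abstract does not specify the smoothing, so the sketch substitutes the Chen-Mangasarian smoothing of the plus function (see [6]) as a stand-in. The Gaussian kernel, the function names, and all parameter values (`alpha`, `lam`, `gamma`, `lr`, `epochs`) are likewise assumptions made for illustration, not the authors' choices.

```python
import math
import random

def smooth_plus(t, alpha=5.0):
    """Chen-Mangasarian smoothing of max(0, t): t + log(1 + exp(-alpha*t)) / alpha.
    Assumed stand-in; the paper's own smoothing is not given in the abstract."""
    # Equivalent, numerically stable form.
    return max(t, 0.0) + math.log1p(math.exp(-alpha * abs(t))) / alpha

def smooth_plus_grad(t, alpha=5.0):
    """Derivative of smooth_plus with respect to t: the logistic sigmoid of alpha*t."""
    z = alpha * t
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def train_sgd(X, y, lam=0.01, alpha=5.0, gamma=1.0, lr=0.1, epochs=300, seed=0):
    """SGD on (1/n) * sum_i smooth_plus(1 - y_i f(x_i)) + lam * c^T K c,
    where f(x) = sum_j c_j k(x_j, x). Labels y_i must be +1 or -1."""
    rng = random.Random(seed)
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    c = [0.0] * n
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one shuffled pass over the data
            f_i = sum(c[j] * K[i][j] for j in range(n))
            # Derivative of the smoothed hinge term with respect to f(x_i).
            g = -y[i] * smooth_plus_grad(1.0 - y[i] * f_i, alpha)
            # Gradient of the Tikhonov term lam * c^T K c is 2 * lam * K c.
            Kc = [sum(K[j][k] * c[k] for k in range(n)) for j in range(n)]
            for j in range(n):
                c[j] -= lr * (g * K[i][j] + 2.0 * lam * Kc[j])
    return c

def predict(X_train, c, x, gamma=1.0):
    """Sign of the kernel expansion at x."""
    f = sum(cj * gaussian_kernel(xj, x, gamma) for cj, xj in zip(c, X_train))
    return 1 if f >= 0 else -1

if __name__ == "__main__":
    # XOR-style toy data: not linearly separable, so the kernel matters.
    X_toy = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
    y_toy = [1, 1, -1, -1]
    coeffs = train_sgd(X_toy, y_toy)
    print([predict(X_toy, coeffs, x) for x in X_toy])
```

Larger `alpha` brings the smoothed loss closer to the true hinge loss, at the cost of a steeper gradient near the kink. Recomputing `K c` at every step keeps the sketch short but costs O(n^2) per update, so it is only suitable for small toy problems.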

[1]  Piotr Jȩdrzejowicz, et al.  Gene Expression Programming Ensemble for Classifying Big Datasets, 2017, ICCCI.

[2]  Tsuyoshi Murata, et al.  {m, 1934, ACML.

[3]  Tomaso A. Poggio, et al.  Regularization Theory and Neural Networks Architectures, 1995, Neural Computation.

[4]  N. Aronszajn.  Theory of Reproducing Kernels, 1950.

[5]  Hiroyuki Kasai.  SGDLibrary: A MATLAB library for stochastic optimization algorithms, 2017, J. Mach. Learn. Res.

[6]  Olvi L. Mangasarian, et al.  A class of smoothing functions for nonlinear and mixed complementarity problems, 1996, Comput. Optim. Appl.

[7]  Lorenzo Rosasco, et al.  Are Loss Functions All the Same?, 2004, Neural Computation.

[8]  David M. Blei, et al.  Stochastic Gradient Descent as Approximate Bayesian Inference, 2017, J. Mach. Learn. Res.

[9]  Sergei V. Pereverzev, et al.  Regularization Theory for Ill-Posed Problems: Selected Topics, 2013.

[10]  J. Hadamard, et al.  Lectures on Cauchy's Problem in Linear Partial Differential Equations, 1924.

[11]  Yuh-Jye Lee, et al.  epsilon-SSVR: A Smooth Support Vector Machine for epsilon-Insensitive Regression, 2005, IEEE Trans. Knowl. Data Eng.

[12]  Ersan YAZAN, et al.  Comparison of the stochastic gradient descent based optimization techniques, 2017, 2017 International Artificial Intelligence and Data Processing Symposium (IDAP).

[13]  Mark W. Schmidt, et al.  Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches, 2007, ECML.

[14]  Wen Shen, et al.  Three-way decisions based blocking reduction models in hierarchical classification, 2020, Inf. Sci.

[15]  Suely Oliveira, et al.  Smoothed Hinge Loss and ℓ1 Support Vector Machines, 2018, 2018 IEEE International Conference on Data Mining Workshops (ICDMW).

[16]  Esra Kaya, et al.  Banknote Classification Using Artificial Neural Network Approach, 2016.

[17]  Gorjan Alagic, et al.  #p, 2019, Quantum information & computation.

[18]  Mohamed Quafafou, et al.  Supervised learning as an inverse problem based on non-smooth loss function, 2020, Knowledge and Information Systems.

[19]  Bernhard Schölkopf, et al.  A Generalized Representer Theorem, 2001, COLT/EuroCOLT.

[20]  Mourad Nachaoui, et al.  Some novel numerical techniques for an inverse Cauchy problem, 2021, J. Comput. Appl. Math.