Maximal Uncorrelated Multinomial Logistic Regression

Multinomial logistic regression (MLR) is widely used in face recognition, text classification, and related fields. However, standard MLR does not address the problem of data redundancy: in multi-class classification, different classes often share many similar features, which can cause those classes to be misclassified. Since data redundancy is a common phenomenon in many domains, this paper proposes a maximal uncorrelated MLR (MUMLR) classification model to address it in multi-class classification. The main idea is to reduce the weights assigned to similar features and retain more of the discriminative information in the data by adding an uncorrelated regularization term. In addition, we use the Cauchy–Bunyakovsky–Schwarz inequality to relax the original objective function into a convex one and solve it with the Adam optimization method. The main advantage is that, on data with substantial redundant information, the proposed algorithm classifies better than state-of-the-art methods. We further show that the proposed regularization can also be applied to neural networks, where it achieves good results.
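
The abstract does not reproduce the exact form of the uncorrelated regularizer, so the sketch below is only an illustration of the general idea under stated assumptions: penalize correlation between the class weight vectors of an MLR model and minimize the regularized cross-entropy loss with Adam. The function names (`uncorrelated_penalty`, `train_mumlr`), the specific penalty (squared off-diagonal entries of the correlation matrix between class weights), and the hyperparameter `lam` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def uncorrelated_penalty(W):
    # W: (num_classes, num_features) weight matrix.
    # Assumed penalty: squared pairwise correlations between class weight vectors,
    # so that highly similar (redundant) class weights are discouraged.
    Wc = W - W.mean(dim=1, keepdim=True)                  # center each class vector
    cov = Wc @ Wc.t()                                     # (C, C) covariance-like matrix
    std = torch.diag(cov).clamp_min(1e-12).sqrt()         # per-class norms, guarded against 0
    corr = cov / (std[:, None] * std[None, :])            # pairwise correlation matrix
    off_diag = corr - torch.diag(torch.diag(corr))        # drop self-correlations
    return (off_diag ** 2).sum()

def train_mumlr(X, y, num_classes, lam=0.1, epochs=200, lr=1e-2):
    # X: (n, d) float tensor of features; y: (n,) long tensor of class labels.
    n, d = X.shape
    W = (0.01 * torch.randn(num_classes, d)).requires_grad_(True)
    b = torch.zeros(num_classes, requires_grad=True)
    opt = torch.optim.Adam([W, b], lr=lr)                 # Adam, as in the paper
    for _ in range(epochs):
        opt.zero_grad()
        logits = X @ W.t() + b
        # Multinomial logistic loss plus the assumed uncorrelated regularizer.
        loss = F.cross_entropy(logits, y) + lam * uncorrelated_penalty(W)
        loss.backward()
        opt.step()
    return W.detach(), b.detach()

# Example usage on synthetic data:
# X = torch.randn(500, 20); y = torch.randint(0, 4, (500,))
# W, b = train_mumlr(X, y, num_classes=4)
```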
