Modified Logistic Regression: An Approximation to SVM and Its Applications in Large-Scale Text Categorization

Logistic Regression (LR) has been widely used in statistics for many years and has recently received extensive study in the machine learning community due to its close relations to Support Vector Machines (SVM) and AdaBoost. In this paper, we use a modified version of LR to approximate the SVM optimization by a sequence of unconstrained optimization problems. We prove that our approximation converges to SVM, and propose an iterative algorithm called "MLR-CG" which uses Conjugate Gradient as its inner loop. A multiclass version, "MMLR-CG", is obtained after simple modifications. We compare MLR-CG with SVMlight over different text categorization collections and show that our algorithm is much more efficient than SVMlight when the number of training examples is very large. Results of the multiclass version MMLR-CG are also reported.
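As a rough illustration of the idea described above, the sketch below trains a linear classifier by minimizing a smoothed-hinge (scaled-softplus) surrogate with conjugate gradient over a sequence of unconstrained problems with increasing sharpness. This is only a minimal sketch under the assumption that the modified LR loss takes the form (1/gamma) * log(1 + exp(gamma * (1 - y * w.x))), which tends to the SVM hinge loss as gamma grows; the names `gamma_schedule`, `lam`, and `train_mlr_cg` are illustrative and not taken from the paper.

```python
# Hedged sketch of an MLR-CG-style training loop (not the authors' exact code):
# a scaled-softplus surrogate of the hinge loss is minimized by conjugate
# gradient for an increasing sequence of sharpness values gamma, so each
# unconstrained problem approximates the SVM objective more closely.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable sigmoid


def mlr_objective(w, X, y, lam, gamma):
    """Assumed surrogate: lam/2 ||w||^2 + sum_i (1/gamma) log(1 + exp(gamma (1 - y_i x_i.w)))."""
    margins = y * (X @ w)
    loss = np.logaddexp(0.0, gamma * (1.0 - margins)).sum() / gamma
    return 0.5 * lam * (w @ w) + loss


def mlr_gradient(w, X, y, lam, gamma):
    """Gradient of the surrogate; the softplus term differentiates to a sigmoid weight per example."""
    margins = y * (X @ w)
    s = expit(gamma * (1.0 - margins))
    return lam * w - X.T @ (s * y)


def train_mlr_cg(X, y, lam=1e-2, gamma_schedule=(1.0, 4.0, 16.0, 64.0)):
    """Solve a sequence of unconstrained problems with CG, warm-starting each stage."""
    w = np.zeros(X.shape[1])
    for gamma in gamma_schedule:
        res = minimize(mlr_objective, w, jac=mlr_gradient,
                       args=(X, y, lam, gamma), method="CG")
        w = res.x
    return w


if __name__ == "__main__":
    # Tiny synthetic binary problem with labels in {-1, +1}.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    true_w = rng.normal(size=20)
    y = np.sign(X @ true_w)
    w = train_mlr_cg(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

Warm-starting each stage from the previous solution is the natural design choice here: as gamma increases the surrogate sharpens toward the hinge loss, and the earlier, smoother problems give CG a good starting point for the harder ones.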
