How to loose confidence: probabilistic linear machines for multiclass classification

In this paper we propose a novel multiclass classifier, the probabilistic linear machine (PLM), which overcomes the low-entropy problem of exponential-based classifiers. Although PLMs are linear classifiers, a careful design of the parameters, matched with weak requirements on the features, lets them output a true probability distribution over labels given an input instance. We cast the discriminative learning problem as a linear program, which scales to large problems on the order of millions of training samples. Our experiments on phonetic classification show that PLMs achieve high entropy while maintaining accuracy comparable to other state-of-the-art classifiers.
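To make the two ideas in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the construction from the paper, whose details the abstract does not give. It rests on one assumed sufficient condition for a linear map to yield a valid distribution: the feature vector is nonnegative and sums to one, and the hypothetical parameter matrix `W` is nonnegative with every column summing to one. The sketch also shows how an exponential (softmax) classifier collapses toward a near-zero-entropy output as its scores grow. All names (`W`, `x`, `linear_probabilities`) are illustrative.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in nats, skipping zero-probability entries."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())


def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    z = np.asarray(scores, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def linear_probabilities(W, x):
    """Purely linear map from features to class probabilities.

    Assumption (not taken from the paper): W >= 0 with columns summing
    to 1, and x >= 0 with entries summing to 1. Under these conditions
    W @ x is nonnegative and sums to 1.
    """
    return W @ x


rng = np.random.default_rng(0)
num_classes, num_features = 4, 6

# Hypothetical parameters: nonnegative, each column normalized to sum to 1.
W = rng.random((num_classes, num_features))
W /= W.sum(axis=0, keepdims=True)

# Hypothetical feature vector on the probability simplex.
x = rng.random(num_features)
x /= x.sum()

p_linear = linear_probabilities(W, x)

# Exponential classifier: scaling the same scores sharpens the output.
scores = np.array([2.0, 1.0, 0.0, -1.0])
p_soft = softmax(scores)
p_sharp = softmax(100.0 * scores)

print("linear output:", p_linear, "entropy:", entropy(p_linear))
print("softmax entropy:", entropy(p_soft))
print("softmax entropy, scores x100:", entropy(p_sharp))
```

The linear output is a valid distribution by construction, with no exponentiation or normalization step at prediction time. The softmax output, by contrast, loses almost all of its entropy once the score gaps become large, which is the behavior the abstract refers to as the low-entropy problem.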
