Paper information - Laplace maximum margin Markov networks

Laplace maximum margin Markov networks

We propose Laplace max-margin Markov networks (LapM3N) and, more generally, a class of Bayesian M3N (BM3N), of which LapM3N is a special case with a sparse structural bias, for robust structured prediction. BM3N generalizes existing structured prediction rules based on a point estimator to a Bayes predictor that uses a learnt distribution of rules. We present a novel Structured Maximum Entropy Discrimination (SMED) formalism for combining Bayesian and max-margin learning of Markov networks for structured prediction; our approach subsumes the conventional M3N as a special case. We offer an efficient learning algorithm, based on variational inference and standard convex-optimization solvers for M3N, together with a generalization bound. Our method outperforms competing ones on both synthetic and real OCR data.

Bo Zhang | Eric P. Xing | Jun Zhu

[1] Tommi S. Jaakkola, et al. Maximum Entropy Discrimination, 1999, NIPS.

[2] John D. Lafferty, et al. Boosting and Maximum Likelihood for Exponential Models, 2001, NIPS.

[3] Thomas Hofmann, et al. Hidden Markov Support Vector Machines, 2003, ICML.

[4] Ben Taskar, et al. Max-Margin Markov Networks, 2003, NIPS.

[5] Xavier Carreras, et al. Exponentiated gradient algorithms for log-linear structured prediction, 2007, ICML '07.

[6] Nathan Ratliff, et al. (Online) Subgradient Methods for Structured Prediction, 2007.

[7] Martin J. Wainwright, et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2006, NIPS.

[8] B. Schölkopf, et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2007.

[9] Ben Taskar, et al. Exponentiated Gradient Algorithms for Large-margin Structured Classification, 2004, NIPS.

[10] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[11] Mário A. T. Figueiredo. Adaptive Sparseness for Supervised Learning, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[12] Alexander J. Smola, et al. Support vector machine learning, 2001, Tutorial Guide. ISCAS 2001. IEEE International Symposium on Circuits and Systems (Cat. No.01TH8573).

[13] Jianfeng Gao, et al. Scalable training of L1-regularized log-linear models, 2007, ICML '07.

[14] John Langford, et al. An Improved Predictive Accuracy Bound for Averaging Classifiers, 2001, ICML.

[15] O. Mangasarian, et al. Robust linear programming discrimination of two linearly inseparable sets, 1992.

[16] Miroslav Dudík, et al. Maximum Entropy Density Estimation with Generalized Regularization and an Application to Species Distribution Modeling, 2007, J. Mach. Learn. Res.

[17] Nuno Vasconcelos, et al. Direct convex relaxations of sparse SVM, 2007, ICML '07.

[18] Yuan Qi, et al. Bayesian Conditional Random Fields, 2005, AISTATS.

[19] Daphne Koller, et al. Efficient Structure Learning of Markov Networks using L1-Regularization, 2006, NIPS.

[20] Thomas Hofmann, et al. Support vector machine learning for interdependent and structured output spaces, 2004, ICML.

[21] Ben Taskar, et al. Structured Prediction via the Extragradient Method, 2005, NIPS.

[22] Ata Kabán, et al. On Bayesian classification with Laplace priors, 2007, Pattern Recognit. Lett.

[23] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.