Structure Regularization for Structured Prediction

While weight regularization has been studied extensively, structure regularization has received little attention. Many existing structured prediction systems focus on increasing the level of structural dependencies within the model. However, this trend may be misdirected: our study suggests that complex structures are actually harmful to generalization in structured prediction. To control structure-based overfitting, we propose a structure regularization framework based on structure decomposition, which decomposes training samples into mini-samples with simpler structures, yielding a model with better generalization power. We show both theoretically and empirically that structure regularization can effectively control overfitting risk and improve accuracy. As a by-product, the proposed method also substantially accelerates training. The method and the theoretical results apply to general graphical models with arbitrary structures. Experiments on well-known, highly competitive tasks demonstrate that our method outperforms strong benchmark systems, achieving record-breaking accuracies with substantially faster training.
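The core operation described above, decomposing a structured training sample into mini-samples with simpler structures, can be sketched for the sequence-labeling case. This is a minimal illustration, not the authors' implementation: the function name and the choice of contiguous equal-sized chunks are assumptions; the key idea is that cutting the tag chain at split points weakens long-range structural dependencies.

```python
def decompose_sequence(tokens, tags, num_pieces):
    """Split one structured sample (a tagged sequence) into `num_pieces`
    contiguous mini-samples. Label dependencies are severed at the cut
    points, so each mini-sample has a simpler structure than the original.
    Hypothetical sketch of structure decomposition for sequence labeling."""
    assert len(tokens) == len(tags) and num_pieces >= 1
    n = len(tokens)
    # Round-up chunk size so every position lands in exactly one mini-sample.
    size = -(-n // num_pieces)
    mini_samples = []
    for start in range(0, n, size):
        end = min(start + size, n)
        mini_samples.append((tokens[start:end], tags[start:end]))
    return mini_samples
```

Training then proceeds on the pool of mini-samples instead of the full sequences; a larger `num_pieces` corresponds to stronger structure regularization (simpler structures), while `num_pieces = 1` recovers ordinary training on the intact structure.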
