AN ABSTRACT OF THE DISSERTATION OF

Guohua Hao for the degree of Doctor of Philosophy in Computer Science, presented on July 21, 2009.

Title: Efficient Training and Feature Induction in Sequential Supervised Learning

Abstract approved: Thomas G. Dietterich

Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.

In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential supervised learning problems. Existing training algorithms for CRFs are slow, particularly on problems with large numbers of potential input features and feature combinations. This dissertation describes a new algorithm, TREECRF, which trains CRFs via gradient tree boosting. In TREECRF, the CRF potential functions are represented as weighted sums of regression trees, which provide compact representations of feature interactions, so the algorithm never needs to enumerate the potentially large parameter space explicitly. As a result, gradient tree boosting scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially as in previous algorithms based on iterative scaling and gradient descent. Detailed experiments evaluate the performance of TREECRF, and possible extensions of the algorithm are discussed.

We also study the problem of handling missing input values in CRFs, which has rarely been discussed in the literature. Gradient tree boosting makes it possible to handle missing values in CRFs with instance weighting (as in C4.5) and surrogate splitting (as in CART). Experimental comparisons of these two methods, along with standard imputation and indicator-feature methods, show that instance weighting is the best method in most cases when feature values are missing at random.

In the direction of feature induction, we study the search-based structured learning framework and its application to sequential supervised learning problems. By formulating label sequence prediction as an incremental search from one end of a sequence to the other, this framework avoids complicated inference algorithms during training and thus trains very quickly. However, when long-range dependencies exist between the current position and future positions, the framework cannot exploit them at each search step to make accurate predictions. This dissertation proposes a multiple-instance learning based algorithm that automatically extracts useful features from future positions in order to discover and exploit these long-range dependencies. Integrating this algorithm with maximum entropy Markov models yields promising experimental results on both synthetic and real data sets containing long-range dependencies.
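
To make the idea of potential functions represented as weighted sums of regression trees concrete, the following is a minimal sketch of gradient tree boosting, not the dissertation's actual implementation. For clarity it drops the Markov (transition) part, so it reduces to boosted multinomial logistic regression over individual positions; the full TREECRF computes the functional gradients with forward-backward over each sequence. The class name `BoostedPotential` and all parameter choices here are illustrative assumptions.

```python
# Sketch: per-label potential F_k(x) stored as a sum of small regression trees,
# grown by fitting each new tree to the functional gradient of the log-likelihood.
import numpy as np
from sklearn.tree import DecisionTreeRegressor


class BoostedPotential:
    def __init__(self, n_labels, n_rounds=50, max_depth=3, shrinkage=0.1):
        self.n_labels = n_labels
        self.n_rounds = n_rounds
        self.max_depth = max_depth
        self.shrinkage = shrinkage
        self.trees = [[] for _ in range(n_labels)]  # trees[k]: trees for label k

    def _scores(self, X):
        # F_k(x) for every example and label: shrunken sum of tree outputs.
        F = np.zeros((X.shape[0], self.n_labels))
        for k in range(self.n_labels):
            for tree in self.trees[k]:
                F[:, k] += self.shrinkage * tree.predict(X)
        return F

    def fit(self, X, y):
        for _ in range(self.n_rounds):
            F = self._scores(X)
            P = np.exp(F - F.max(axis=1, keepdims=True))
            P /= P.sum(axis=1, keepdims=True)
            for k in range(self.n_labels):
                # Functional gradient w.r.t. F_k: indicator(y == k) - P(y = k | x).
                residual = (y == k).astype(float) - P[:, k]
                tree = DecisionTreeRegressor(max_depth=self.max_depth)
                tree.fit(X, residual)
                self.trees[k].append(tree)
        return self

    def predict(self, X):
        return self._scores(X).argmax(axis=1)
```

Because each boosting round only grows trees over the observed features, the model never materializes an explicit weight for every feature combination, which is the source of the scaling advantage described above.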
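
The instance-weighting treatment of missing values can likewise be illustrated with a small sketch, assuming a hypothetical dict-based tree-node layout (`feature`, `threshold`, `left_fraction`, `left`, `right`, leaf `value`); this shows the C4.5-style mechanism only, not the dissertation's code.

```python
# Sketch: C4.5-style instance weighting at tree splits. An example whose split
# feature is missing is sent down both branches, with its weight divided in
# proportion to the training weight that followed each branch.
def route(node, example, weight=1.0):
    """Return [(leaf_value, weight), ...] reached by one example."""
    if 'value' in node:                      # leaf node
        return [(node['value'], weight)]
    x = example.get(node['feature'])         # None means the value is missing
    if x is None:
        p_left = node['left_fraction']       # fraction of training weight that went left
        return (route(node['left'], example, weight * p_left) +
                route(node['right'], example, weight * (1.0 - p_left)))
    branch = node['left'] if x <= node['threshold'] else node['right']
    return route(branch, example, weight)


def predict(node, example):
    """Weighted average of all leaves the example reaches."""
    leaves = route(node, example)
    total = sum(w for _, w in leaves)
    return sum(v * w for v, w in leaves) / total
```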
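
Finally, the incremental search formulation of label sequence prediction can be sketched as a greedy left-to-right decoder. The helper `featurize` and the per-position classifier `clf` (any probabilistic classifier with a `predict_proba` method, e.g. logistic regression) are assumptions for illustration; actual search-based learners often use beam search rather than a purely greedy pass. The sketch makes visible why long-range dependencies are a problem: at step t the features can only look at the current and previous positions.

```python
# Sketch: left-to-right incremental decoding as used in search-based
# structured learning / MEMM-style models.
import numpy as np


def featurize(x_seq, t, prev_label, n_labels):
    """Features of position t: the observation plus a one-hot encoding of
    the previously predicted label (no access to future positions)."""
    prev = np.zeros(n_labels)
    if prev_label is not None:
        prev[prev_label] = 1.0
    return np.concatenate([x_seq[t], prev])


def greedy_decode(x_seq, clf, n_labels):
    """Label the sequence one position at a time, committing to the most
    probable label at each step."""
    labels = []
    prev = None
    for t in range(len(x_seq)):
        feats = featurize(x_seq, t, prev, n_labels).reshape(1, -1)
        probs = clf.predict_proba(feats)[0]
        prev = int(np.argmax(probs))
        labels.append(prev)
    return labels
```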
