S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking

Non-linear models have recently received a lot of attention as people start to discover the power of statistical and embedding features. However, tree-based models are seldom studied in the context of structured learning, despite their recent success on various classification and ranking tasks. In this paper, we propose S-MART, a tree-based structured learning framework based on multiple additive regression trees. S-MART is especially suitable for handling tasks with dense features, and can be used to learn many different structures under various loss functions. We apply S-MART to the task of tweet entity linking, a core component of tweet information extraction, which aims to identify name mentions and link them to entities in a knowledge base. A novel inference algorithm is proposed to handle the special structure of the task. The experimental results show that S-MART significantly outperforms state-of-the-art tweet entity linking systems.

Yi Yang | Ming-Wei Chang
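
The abstract builds on multiple additive regression trees (MART). As a rough illustration of that building block only, the sketch below fits plain gradient-boosted regression stumps to scalar targets under squared loss. It is not the S-MART algorithm: it has no structured outputs and no inference step, and every name in it (`fit_stump`, `fit_mart`, `predict`, the learning rate `lr`) is a hypothetical choice made for this example.

```python
from typing import List, Tuple

# A stump is (threshold, value_if_x_le_threshold, value_if_x_gt_threshold).
Stump = Tuple[float, float, float]


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Fit a depth-1 regression tree to the residuals by least squares."""
    best = None
    best_err = float("inf")
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must put points on both sides
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if err < best_err:
            best_err, best = err, (t, lv, rv)
    if best is None:
        # No valid split (all x identical): fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best


def predict(stumps: List[Stump], lr: float, x: float) -> float:
    """Additive model: sum of shrunken stump outputs."""
    return sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)


def fit_mart(xs: List[float], ys: List[float],
             n_trees: int = 20, lr: float = 0.5) -> List[Stump]:
    """Gradient boosting for squared loss (assumed loss for this sketch)."""
    stumps: List[Stump] = []
    for _ in range(n_trees):
        # For the loss 0.5 * (y - F(x))^2, the negative gradient with
        # respect to F(x) is the residual y - F(x).
        residuals = [y - predict(stumps, lr, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return stumps
```

Calling `fit_mart` on a small step-shaped dataset and then `predict` with the same `lr` should recover the step closely after a few dozen rounds. Each round fits a new tree to the current negative gradient and adds it to the ensemble, which is the additive structure the name MART refers to.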
