A Structural Support Vector Method for Extracting Contexts and Answers of Questions from Online Forums

This paper addresses the issue of extracting contexts and answers of questions from post discussion of online forums. We propose a novel and unified model by customizing the structural Support Vector Machine method. Our customization has several attractive properties: (1) it gives a comprehensive graphical representation of thread discussion. (2) It designs special inference algorithms instead of general-purpose ones. (3) It can be readily extended to different task preferences by varying loss functions. Experimental results on a real data set show that our methods are both promising and flexible.

[1]  Thorsten Joachims,et al.  Training structural SVMs when exact inference is intractable , 2008, ICML '08.

[2]  Thorsten Joachims,et al.  Cutting-plane training of structural SVMs , 2009, Machine Learning.

[3]  Liang Zhou,et al.  Digesting Virtual "Geek" Culture: The Summarization of Technical Internet Relay Chats , 2005, ACL.

[4]  Owen Rambow,et al.  Summarizing Email Threads , 2004, NAACL.

[5]  Thomas Hofmann,et al.  Hierarchical document categorization with support vector machines , 2004, CIKM '04.

[6]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[7]  Yunsong Guo,et al.  Comparisons of sequence labeling algorithms and extensions , 2007, ICML '07.

[8]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[9]  Andrew McCallum,et al.  Collective Segmentation and Labeling of Distant Entities in Information Extraction , 2004 .

[10]  Jimmy J. Lin,et al.  Overview of the TREC 2007 Question Answering Track , 2008, TREC.

[11]  Young-In Song,et al.  Finding question-answer pairs from online forums , 2008, SIGIR '08.

[12]  Kathleen McKeown,et al.  Detection of Question-Answer Pairs in Email Conversations , 2004, COLING.

[13]  Xiaoyan Zhu,et al.  Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums , 2008, ACL.

[14]  Ben Taskar,et al.  Max-Margin Markov Networks , 2003, NIPS.

[15]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[16]  Hoa Trang Dang,et al.  Overview of the TREC 2006 Question Answering Track 99 , 2006, TREC.

[17]  Ming Zhou,et al.  Extracting Chatbot Knowledge from Online Discussion Forums , 2007, IJCAI.

[18]  W. Bruce Croft,et al.  Finding similar questions in large question and answer archives , 2005, CIKM '05.

[19]  Stephen Wan,et al.  Generating Overview Summaries of Ongoing Email Thread Discussions , 2004, COLING.

[20]  Wei-Ying Ma,et al.  2D Conditional Random Fields for Web information extraction , 2005, ICML.

[21]  Tat-Seng Chua,et al.  Question answering passage retrieval using dependency relations , 2005, SIGIR '05.

[22]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[23]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[24]  Ani Nenkova,et al.  Facilitating email thread access by extractive summary generation , 2003, RANLP.

[25]  Sanda M. Harabagiu,et al.  Methods for Using Textual Entailment in Open-Domain Question Answering , 2006, ACL.

[26]  Michel Galley,et al.  A Skip-Chain Conditional Random Field for Ranking Meeting Utterances by Importance , 2006, EMNLP.

[27]  Jihie Kim,et al.  An intelligent discussion-bot for answering student queries in threaded discussions , 2006, IUI '06.