SUper Team at SemEval-2016 Task 3: Building a Feature-Rich System for Community Question Answering

We present the system we built to participate in SemEval-2016 Task 3 on Community Question Answering. We achieved the best results on subtask C, and strong results on subtasks A and B, by combining a rich set of features of several types: semantic, lexical, metadata, and user-related. The most important feature groups turned out to be the metadata for the question and for the comment, semantic vectors trained on QatarLiving data, and similarities between the question and the comment for subtasks A and C, and between the original and the related question for subtask B.
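To make the idea of a question-comment similarity feature concrete, here is a minimal sketch. The abstract does not say how the similarities are computed, so the choices below are assumptions made for illustration: each text is represented by the average of its word vectors, and the feature is the cosine between the two averages. The tiny embedding table `TOY_VECTORS` and the helper names are hypothetical; in practice the vectors would come from a model trained on forum data such as QatarLiving.

```python
import math

# Toy 3-dimensional embeddings with made-up values (illustration only).
# A real system would load vectors trained on in-domain forum text.
TOY_VECTORS = {
    "visa": [0.9, 0.1, 0.0],
    "renew": [0.7, 0.3, 0.1],
    "permit": [0.8, 0.2, 0.1],
    "beach": [0.0, 0.2, 0.9],
}


def average_vector(tokens, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_similarity(question, comment, vectors=TOY_VECTORS):
    """One similarity feature: cosine between averaged word vectors.

    Tokenization is a plain lowercase whitespace split (an assumption).
    Returns 0.0 when either text has no known words.
    """
    q = average_vector(question.lower().split(), vectors)
    c = average_vector(comment.lower().split(), vectors)
    if q is None or c is None:
        return 0.0
    return cosine(q, c)
```

With these toy vectors, `semantic_similarity("renew visa", "permit")` scores higher than `semantic_similarity("renew visa", "beach")`. A score like this would be one value in the feature vector, alongside the lexical, metadata, and user-related features.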
