Inspire at SemEval-2016 Task 2: Interpretable Semantic Textual Similarity Alignment based on Answer Set Programming

In this paper we present our system developed for SemEval-2016 Task 2, Interpretable Semantic Textual Similarity, along with the results obtained for our submitted runs. Our system participated in both subtasks, predicting chunk similarity alignments for gold chunks as well as for predicted chunks. The Inspire system extends the basic ideas of last year's participant NeRoSim; however, we realize the rules in logic programming and obtain the result with an Answer Set Solver. To prepare the input for the logic program, we use the PunktTokenizer, Word2Vec, and WordNet APIs of NLTK, and the POS and NER taggers from Stanford CoreNLP. For chunking we use a joint POS tagger and dependency parser, and based on its output we determine chunks with an Answer Set Program. Our system ranked third overall and first in the Headlines gold chunk subtask.
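
The abstract does not come with code, so the following is a minimal sketch, assuming a hypothetical fact schema (token/3, synonym/2) that is not taken from the paper, of how the preprocessing step could serialize tokens and WordNet synonym pairs as input facts for the logic program. It uses NLTK's tokenizer and WordNet API and assumes the "punkt" and "wordnet" NLTK data packages are installed.

```python
# Minimal sketch (not the authors' code): derive ASP input facts from two
# chunks using NLTK. The predicate names token/3 and synonym/2 are
# hypothetical; the paper does not specify its fact schema.
from nltk.corpus import wordnet as wn
from nltk.tokenize import word_tokenize

def wordnet_synonym(a, b):
    """True if the two lemmas share at least one WordNet synset."""
    return bool(set(wn.synsets(a)) & set(wn.synsets(b)))

def chunk_facts(sent_id, chunk):
    """Serialize one chunk's tokens as ASP facts token(SentId,Pos,"word")."""
    return [f'token({sent_id},{i},"{w.lower()}").'
            for i, w in enumerate(word_tokenize(chunk))]

def synonym_facts(chunk1, chunk2):
    """Emit synonym("w1","w2") facts for cross-chunk WordNet synonym pairs."""
    facts = []
    for w1 in word_tokenize(chunk1):
        for w2 in word_tokenize(chunk2):
            if w1 != w2 and wordnet_synonym(w1.lower(), w2.lower()):
                facts.append(f'synonym("{w1.lower()}","{w2.lower()}").')
    return facts

if __name__ == "__main__":
    c1, c2 = "a small boy", "a little kid"
    program = chunk_facts(1, c1) + chunk_facts(2, c2) + synonym_facts(c1, c2)
    print("\n".join(program))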

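In the same spirit, here is a minimal sketch of how such facts could be combined with alignment rules and solved with the clingo Python API from the Potassco collection. The rules below are illustrative stand-ins written for this example; they are not the NeRoSim-derived rules the system actually implements.

```python
# Minimal sketch (not the authors' program): solve a toy chunk-alignment
# problem with the clingo Python API (Potassco). All predicate names and
# rules here are hypothetical placeholders for the paper's actual rules.
import clingo

PROGRAM = """
% each chunk of sentence 1 may align to at most one chunk of sentence 2 ...
{ align(C1,C2) : chunk(2,C2) } 1 :- chunk(1,C1).
% ... and vice versa, so the alignment is one-to-one
:- align(C1,C2), align(D1,C2), C1 != D1.

% only align chunks that share a word or a (hypothetical) synonym pair
related(C1,C2) :- word(1,C1,W),  word(2,C2,W).
related(C1,C2) :- word(1,C1,W1), word(2,C2,W2), synonym(W1,W2).
:- align(C1,C2), not related(C1,C2).

% prefer alignments that cover as many chunk pairs as possible
#maximize { 1,C1,C2 : align(C1,C2) }.
#show align/2.
"""

FACTS = """
chunk(1,c1). chunk(1,c2).  word(1,c1,"boy"). word(1,c2,"runs").
chunk(2,d1). chunk(2,d2).  word(2,d1,"kid"). word(2,d2,"runs").
synonym("boy","kid").
"""

def main():
    ctl = clingo.Control()
    ctl.add("base", [], PROGRAM + FACTS)
    ctl.ground([("base", [])])
    # on_model fires for each improving model; the last one reported is optimal
    ctl.solve(on_model=lambda m: print("alignment:", m.symbols(shown=True)))

if __name__ == "__main__":
    main()
```

With these toy facts the optimal model should contain align(c1,d1), licensed by the synonym pair, and align(c2,d2), licensed by the shared token "runs".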