Out-of-context noun phrase semantic interpretation with cross-linguistic evidence

The acquisition of semantic knowledge is paramount for any application that requires a deep understanding of natural language text. Motivated by the problem of building a noun phrase-level semantic parser and adapting it to applications such as machine translation and multilingual question answering, this paper presents a domain-independent model for noun phrase semantic interpretation. We investigate the problem using cross-linguistic evidence from four Romance languages: Spanish, Italian, French, and Romanian. The focus on Romance languages is well motivated: English noun phrases generally translate into Romance constructions of the form "N P N", where, as we will show, the preposition P varies in ways that correlate with the semantics. Using a set of 22 semantic interpretation categories (such as PART-WHOLE, AGENT, and POSSESSION), we present empirical observations on the distribution of these categories in a cross-lingual corpus and on their mapping to syntactic constructions in English and the Romance languages. Furthermore, given a training set of English noun phrases along with their translations into the four Romance languages, our algorithm automatically learns classification rules and applies them to unseen noun phrase instances for semantic interpretation. Experimental results are compared against a state-of-the-art model reported in the literature.
