Multi-word expressions (MWEs) account for a large portion of the language used in day-to-day interactions. A formal system flexible enough to model these large, often syntactically rich, non-compositional chunks as single units in naturally occurring text could considerably simplify large-scale semantic annotation projects, where it would be undesirable to develop internal compositional analyses of common technical expressions that have specific idiosyncratic meanings. This paper first defines a notion of functor-argument decomposition on phrase structure trees analogous to graph coloring, in which the tree is cast as a graph and the elementary structures of a grammar formalism are the colors. It then presents a formal argument that tree-rewriting systems, a class of grammar formalisms that includes Tree Adjoining Grammars, can produce a proper superset of the functor-argument decompositions that string-rewriting systems can produce.
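The coloring analogy can be made concrete with a small sketch. The code below is an informal illustration only, not the paper's formal definition: the toy tree for "kick the bucket", the choice to color edges rather than nodes, and all function names are assumptions made for this example. It treats a string-rewriting (context-free) decomposition as one in which each color covers exactly one parent and its children, and a tree-rewriting decomposition as one in which each color covers any connected fragment of the tree.

```python
from collections import defaultdict

# Hypothetical toy phrase structure tree; each edge is (parent, child).
# Node labels are unique here, so labels double as node identifiers.
EDGES = [
    ("VP", "V"), ("VP", "NP"),
    ("V", "kick"),
    ("NP", "Det"), ("NP", "N"),
    ("Det", "the"), ("N", "bucket"),
]


def color_classes(coloring):
    """Group edges by the color (elementary structure) assigned to them."""
    classes = defaultdict(list)
    for edge, color in coloring.items():
        classes[color].append(edge)
    return classes


def siblings_share_color(coloring):
    """All edges leaving the same parent must carry the same color."""
    by_parent = defaultdict(set)
    for (parent, _child), color in coloring.items():
        by_parent[parent].add(color)
    return all(len(colors) == 1 for colors in by_parent.values())


def is_connected(edges):
    """True if the given edges form a single connected fragment."""
    reached = set(edges[0])
    remaining = list(edges[1:])
    changed = True
    while changed:
        changed = False
        for edge in list(remaining):
            if edge[0] in reached or edge[1] in reached:
                reached.update(edge)
                remaining.remove(edge)
                changed = True
    return not remaining


def string_rewriting_ok(coloring):
    """Each color covers exactly one local tree (one parent, its children)."""
    if not siblings_share_color(coloring):
        return False
    return all(
        len({parent for parent, _ in edges}) == 1
        for edges in color_classes(coloring).values()
    )


def tree_rewriting_ok(coloring):
    """Each color covers a connected fragment of any depth."""
    if not siblings_share_color(coloring):
        return False
    return all(is_connected(edges) for edges in color_classes(coloring).values())


# One color per rule application: the usual context-free decomposition.
cfg_coloring = {edge: edge[0] for edge in EDGES}

# One color for the whole idiom: the MWE is a single elementary structure.
mwe_coloring = {edge: "kick-the-bucket" for edge in EDGES}

for name, coloring in [("cfg", cfg_coloring), ("mwe", mwe_coloring)]:
    print(name, string_rewriting_ok(coloring), tree_rewriting_ok(coloring))
```

In this toy setting the per-rule coloring passes both checks, while the single-color idiom passes only the tree-rewriting check. That matches the direction of the superset claim on one example; it is not a substitute for the formal argument.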