Conventional wisdom dictates that synchronous context-free grammars (SCFGs) must be converted to Chomsky Normal Form (CNF) to ensure cubic-time decoding. For arbitrary SCFGs, this is typically accomplished via the synchronous binarization technique of Zhang et al. (2006). A drawback of this approach is that it inflates the constant factors associated with decoding, and thus the practical running time. DeNero et al. (2009) tackle this problem by defining a superset of CNF called Lexical Normal Form (LNF), which also supports cubic-time decoding under certain implicit assumptions. In this paper, we make these assumptions explicit and, in doing so, show that LNF can be further expanded to a broader class of grammars (called "scope-3") that also supports cubic-time decoding. By simply pruning non-scope-3 rules from a GHKM-extracted grammar, we obtain better translation performance than with synchronous binarization.
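The pruning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a definition of scope that the abstract itself does not spell out: the scope of a rule's source side is the number of nonterminals at its left and right boundaries plus the number of adjacent nonterminal pairs. Under that assumption, a rule is kept when its scope is at most 3. The upper-case convention for nonterminals and the function names `scope` and `prune_to_scope` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def is_nonterminal(symbol: str) -> bool:
    # Assumption for this sketch: nonterminals are upper case ("X", "NP"),
    # terminals are anything else ("the", ",").
    return symbol.isupper()


def scope(source_side: Sequence[str]) -> int:
    """Count boundary nonterminals plus adjacent nonterminal pairs.

    This follows the assumed definition stated in the lead-in; it is not
    taken from the abstract.
    """
    if not source_side:
        return 0
    count = 0
    # A nonterminal at either edge leaves that span boundary unconstrained.
    if is_nonterminal(source_side[0]):
        count += 1
    if is_nonterminal(source_side[-1]):
        count += 1
    # Each pair of adjacent nonterminals leaves one split point unconstrained.
    for left, right in zip(source_side, source_side[1:]):
        if is_nonterminal(left) and is_nonterminal(right):
            count += 1
    return count


def prune_to_scope(rules: List[Sequence[str]], max_scope: int = 3) -> List[Sequence[str]]:
    # Keep only the rules whose source side has scope <= max_scope.
    return [rule for rule in rules if scope(rule) <= max_scope]


rules = [
    ["X", "Y"],            # binary CNF-style rule
    ["a", "X", "Y", "b"],  # lexical anchors at both ends
    ["X", "a", "Y"],       # lexical anchor in the middle
    ["X", "Y", "Z"],       # three adjacent nonterminals
]
kept = prune_to_scope(rules)
```

Under this assumed definition, a binary rule such as `X Y` has scope exactly 3, which is consistent with the abstract's claim that the scope-3 class extends CNF, while a rule with three adjacent nonterminals exceeds the limit and is pruned.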
[1] David Chiang, et al. Hierarchical Phrase-Based Translation. CL, 2007.
[2] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation. ACL, 2002.
[3] Daniel Gildea, et al. Synchronous Binarization for Machine Translation. NAACL, 2006.
[4] Daniel H. Younger, et al. Recognition and Parsing of Context-Free Languages in Time n^3. Inf. Control, 1967.
[5] John DeNero, et al. Efficient Parsing for Transducer Grammars. HLT-NAACL, 2009.
[6] David Chiang, et al. Forest Rescoring: Faster Decoding with Integrated Language Models. ACL, 2007.
[7] Mark Hopkins, et al. Cube Pruning as Heuristic Search. EMNLP, 2009.
[8] Daniel Marcu, et al. What’s in a translation rule? NAACL, 2004.