Optimizing the Computational Lexicalization of Large Grammars

The computational lexicalization of a grammar is the optimization of the links between lexicalized rules and lexical items in order to improve the quality of the bottom-up filtering during parsing. This problem is NP-complete and untractable on large grammars. An approximation algorithm is presented. The quality of the suboptimal solution is evaluated on real-world grammars as well as on randomly generated ones.