Paper information - Automatic test suite generation for PMCFG grammars

We present a method for finding errors in formalized natural language grammars by automatically and systematically generating test cases for a human oracle to judge. The method works on a per-construction basis: given a construction from the grammar, it generates a finite but complete set of test sentences (typically tens or hundreds) in which that construction is used in all possible ways. Our method is an alternative to using a corpus or a treebank, for which no such completeness guarantee can be made. The method is language-independent and is implemented for the grammar formalism PMCFG, but it also works for weaker grammar formalisms. We evaluate the method on a number of grammars for different natural languages, ranging in size from toy examples to real-world grammars.

Koen Claessen | Inari Listenmaa
