Breaking parsers: mutation-based generation of programs with guaranteed syntax errors

Grammar-based test case generation has focused almost exclusively on generating syntactically correct programs (i.e., positive tests) from a context-free reference grammar. However, a positive test suite cannot detect when the unit under test accepts words outside the language (i.e., false positives). Here, we investigate the converse problem and describe two mutation-based approaches for generating programs with guaranteed syntax errors (i.e., negative tests). Word mutation systematically modifies positive tests by deleting, inserting, substituting, and transposing tokens in such a way that at least one impossible token pair emerges. Rule mutation applies the same operations to the symbols on the right-hand sides of productions in such a way that each derivation that uses the mutated rule yields a word outside the language.
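
To make the idea of word mutation concrete, the following minimal Python sketch applies the four token edits to a positive test and keeps only the mutants that contain an impossible token pair. It is an illustration under stated assumptions, not the implementation described in this paper: the toy grammar, the hand-derived table `POSSIBLE_PAIRS`, the boundary markers `START` and `END`, and all function names are hypothetical. We also assume that "impossible token pair" means two tokens that are never adjacent in any word of the language.

```python
# Toy language (assumption): E -> E '+' T | T ;  T -> 'num' | '(' E ')'
TOKENS = ["num", "+", "(", ")"]
START, END = "^", "$"  # hypothetical markers for the word boundaries

# Hand-derived for the toy grammar: every pair of tokens that is adjacent
# in at least one word of the language. Any other pair is "impossible".
POSSIBLE_PAIRS = {
    (START, "num"), (START, "("),
    ("num", "+"), ("num", ")"), ("num", END),
    ("+", "num"), ("+", "("),
    ("(", "num"), ("(", "("),
    (")", "+"), (")", ")"), (")", END),
}


def has_impossible_pair(word):
    """True if some adjacent token pair occurs in no word of the language."""
    padded = [START, *word, END]
    return any(pair not in POSSIBLE_PAIRS for pair in zip(padded, padded[1:]))


def mutants(word):
    """Yield every word obtained from `word` by a single token edit."""
    n = len(word)
    for i in range(n):
        yield word[:i] + word[i + 1:]                      # deletion
        for t in TOKENS:
            if t != word[i]:
                yield word[:i] + [t] + word[i + 1:]        # substitution
    for i in range(n + 1):
        for t in TOKENS:
            yield word[:i] + [t] + word[i:]                # insertion
    for i in range(n - 1):
        if word[i] != word[i + 1]:
            yield word[:i] + [word[i + 1], word[i]] + word[i + 2:]  # transposition


def negative_tests(word):
    """Keep the distinct mutants that are guaranteed to be syntax errors."""
    seen, result = set(), []
    for m in mutants(word):
        key = tuple(m)
        if key not in seen and has_impossible_pair(m):
            seen.add(key)
            result.append(m)
    return result


if __name__ == "__main__":
    for test in negative_tests(["num", "+", "num"]):
        print(" ".join(test))
```

The filter in this sketch is sound but deliberately conservative: a mutant such as `( num + num` is also a syntax error, yet it is discarded because every adjacent pair in it can occur in some valid word, so the pair check alone cannot guarantee the error.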