Learning Context Free Grammars by Using SAT Solvers

In this paper, we propose a novel approach for learning context-free grammars (CFGs) from positive and negative samples by solving a Boolean satisfiability problem (SAT). We encode the set of samples, together with limits on the sizes of the rule sets to be synthesized, as a Boolean expression. An assignment satisfying the Boolean expression encodes a minimal set of rules that derives all positive samples and no negative samples. One feature of this approach is that it synthesizes a minimal set of rules in Chomsky normal form. Another is that the learning method benefits directly from any improvement in SAT solvers. We present experimental results on learning CFGs for fundamental context-free languages, including the set of strings with equal numbers of a's and b's and the set of strings in {a, b}* not of the form ww.
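The learning problem stated above can be illustrated with a minimal sketch. It is not the SAT encoding of the paper: a brute-force enumeration of candidate rule sets stands in for the solver, and a CYK membership test stands in for the derivation constraints. All names (`candidate_rules`, `derives`, `learn_minimal_grammar`), the sample strings, and the target language a*b are assumptions chosen for illustration only.

```python
from itertools import combinations, product


def candidate_rules(nonterminals, terminals):
    """Enumerate every Chomsky-normal-form rule over the given symbols."""
    unary = [(a, (t,)) for a in nonterminals for t in terminals]
    binary = [(a, (b, c)) for a, b, c in product(nonterminals, repeat=3)]
    return unary + binary


def derives(rules, start, word):
    """CYK membership test: does `start` derive `word` under `rules`?"""
    n = len(word)
    if n == 0:
        return False  # no epsilon rule is considered in this sketch
    # table[i][l] holds the nonterminals that derive word[i:i + l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {a for a, rhs in rules if rhs == (ch,)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for a, rhs in rules:
                    if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                        table[i][length].add(a)
    return start in table[0][n]


def learn_minimal_grammar(positives, negatives, nonterminals, terminals, max_rules):
    """Return a smallest rule set consistent with the samples, or None.

    Exhaustive search replaces the SAT solver here; the first nonterminal
    is taken as the start symbol (an assumption of this sketch).
    """
    start = nonterminals[0]
    rules = candidate_rules(nonterminals, terminals)
    for size in range(1, max_rules + 1):  # smallest sets first, so the result is minimal
        for subset in combinations(rules, size):
            if all(derives(subset, start, w) for w in positives) and not any(
                derives(subset, start, w) for w in negatives
            ):
                return list(subset)
    return None


# Hypothetical samples for the language a*b (zero or more a's followed by one b).
grammar = learn_minimal_grammar(
    positives=["b", "ab", "aab"],
    negatives=["a", "ba", "bb", "aa", "abb"],
    nonterminals=["S", "A"],
    terminals=["a", "b"],
    max_rules=4,
)
for head, body in grammar:
    print(head, "->", " ".join(body))
```

Enumeration grows combinatorially with the number of nonterminals and the rule limit, which is why delegating the search to a SAT solver, as the paper proposes, is attractive for anything beyond toy samples.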
