Incremental Learning of Context Free Grammars by Extended Inductive CYK Algorithm

This paper describes recent improvements to the Synapse system [5, 6] for inductive inference of context-free grammars from sample strings. To infer grammars efficiently, Synapse employs incremental learning based on a rule generation mechanism called the inductive CYK algorithm, which generates the minimum set of production rules required for parsing the positive samples. In the improved version, the form of production rules is extended to include not only A → βγ but also A → β, where each of β and γ is either a terminal or a nonterminal symbol; this form is called extended Chomsky normal form. With this extension and other improvements, Synapse can synthesize both ambiguous and unambiguous grammars in less computation time than the previous system.
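
To make the extended Chomsky normal form concrete, the following is a minimal sketch of a plain CYK-style recognizer for grammars whose rules have the form A → βγ or A → β, with β and γ each a terminal or a nonterminal. It is an illustration only and not Synapse's inductive CYK algorithm: it checks whether a string is derivable from given rules and does not generate new rules. The function names (`unary_closure`, `recognizes`), the tuple encoding of rules, and the example grammar are all assumptions made for this sketch.

```python
def unary_closure(symbols, unary_rules):
    """Return the given symbols plus every A reachable by rules A -> b."""
    closed = set(symbols)
    changed = True
    while changed:
        changed = False
        for head, body in unary_rules:
            if body in closed and head not in closed:
                closed.add(head)
                changed = True
    return closed


def recognizes(string, start, binary_rules, unary_rules):
    """CYK-style recognition for rules A -> b c and A -> b.

    binary_rules: iterable of (A, b, c) tuples (hypothetical encoding).
    unary_rules:  iterable of (A, b) tuples (hypothetical encoding).
    b and c may each be a terminal or a nonterminal.
    """
    n = len(string)
    if n == 0:
        return False  # no rule in this form derives the empty string

    # table[(i, j)] holds every symbol that derives string[i:j].
    table = {}
    for i, ch in enumerate(string):
        # A terminal derives itself, so it is stored in the cell directly.
        # This lets rules mix terminals and nonterminals on the right side.
        table[(i, i + 1)] = unary_closure({ch}, unary_rules)

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = set()
            for k in range(i + 1, j):
                left, right = table[(i, k)], table[(k, j)]
                for head, b, c in binary_rules:
                    if b in left and c in right:
                        cell.add(head)
            table[(i, j)] = unary_closure(cell, unary_rules)

    return start in table[(0, n)]


# Hypothetical example grammar (not taken from the paper):
#   S -> a b | a C | c        C -> S b
BINARY = [("S", "a", "b"), ("S", "a", "C"), ("C", "S", "b")]
UNARY = [("S", "c")]

print(recognizes("aabb", "S", BINARY, UNARY))  # True
print(recognizes("acb", "S", BINARY, UNARY))   # True
print(recognizes("aab", "S", BINARY, UNARY))   # False
```

Storing each terminal in its own length-one cell is one simple way to handle right-hand sides that mix terminals and nonterminals; the paper's actual data structures may differ.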