Introducing Domain and Typing Bias in Automata Inference

Grammatical inference consists of learning formal grammars for unknown languages from sequential learning data. Classically, this data is raw: strings that belong to the language and, possibly, strings that do not. In this paper, we present a generic setting for expressing domain and typing background knowledge. We provide algorithmic solutions that incorporate this additional information efficiently into the classical state-merging automata learning framework. The improvement brought by this background knowledge is shown on both artificial and real data.
