Alignment Based Similarity Measure for Grammar Learning

We introduce a similarity measure, called alignment profile similarity, for learning context-free grammars from given language samples. Building on this measure, we propose an alignment learning framework for grammatical inference. Alignment profile similarity is used to improve the alignments and thereby the quality of the rules identified. Experiments show that the proposed methods increase the percentage of correctly generated grammar rules.
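As a rough illustration of the alignment step that such frameworks build on, the sketch below aligns two tokenized sentences and separates the shared segments from the differing ones; the differing segments are the usual candidates for grammar constituents. This is not the alignment profile similarity measure itself, which the abstract does not define. The function name `align` and the use of Python's `difflib.SequenceMatcher` as the aligner are assumptions made only for this example.

```python
from difflib import SequenceMatcher


def align(sent_a, sent_b):
    """Align two whitespace-tokenized sentences.

    Returns (shared, differing): `shared` lists the token runs common to
    both sentences, `differing` lists pairs of token runs that occupy the
    same slot but differ. Hypothetical helper, not the paper's method.
    """
    a, b = sent_a.split(), sent_b.split()
    # autojunk=False keeps frequent tokens such as "the" from being ignored.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    shared, differing = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            shared.append(a[i1:i2])
        else:
            # replace / insert / delete: the unequal parts of the alignment
            differing.append((a[i1:i2], b[j1:j2]))
    return shared, differing


shared, differing = align("the cat sat on the mat", "the dog sat on the mat")
print(shared)
print(differing)
```

Here the aligned pair differs only in `cat` versus `dog`, so an alignment-based learner could hypothesize that both words belong to one nonterminal. A similarity measure over alignments would then be used to prefer some alignments over others before rules are extracted.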
