Recognition and learning of a class of context-sensitive languages described by augmented regular expressions

This paper introduces Augmented Regular Expressions (AREs), a new formalism that can represent a non-trivial class of context-sensitive languages. AREs extend the expressive power of Regular Expressions (REs) with a set of constraints on the number of times the operands of the RE's star operations are instantiated in a string. An efficient algorithm is given for recognizing the strings of the language described by an ARE. A general method for learning AREs from examples is also presented; it consists of a regular grammatical inference step, a DFA-to-RE transformation, an RE parsing of the examples, and a constraint induction process.
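To make the idea of constraining star-operand counts concrete, the following is a minimal sketch, not the recognition algorithm of the paper. It assumes a hypothetical underlying RE `a*b*c*` and assumes that the constraints are simple equalities between the star counts, which yields the context-sensitive language a^n b^n c^n. The names `star_counts`, `are_accepts`, and `EQUAL_COUNTS` are illustrative only.

```python
import re

# Assumed underlying RE: a*b*c*, with one capture group per star operation.
# Each star operand is a single symbol, so the length of a group equals the
# number of instances of that operand in the string.
PATTERN = re.compile(r"(a*)(b*)(c*)")


def star_counts(s):
    """Parse s with the RE; return the instance count of each star, or None."""
    m = PATTERN.fullmatch(s)
    if m is None:
        return None
    return tuple(len(g) for g in m.groups())


def are_accepts(s, constraints):
    """Accept s if it matches the RE and every constraint holds on the counts."""
    counts = star_counts(s)
    if counts is None:
        return False
    return all(check(counts) for check in constraints)


# Assumed constraint set: all three star counts must be equal.
EQUAL_COUNTS = [
    lambda v: v[0] == v[1],
    lambda v: v[1] == v[2],
]

if __name__ == "__main__":
    for s in ["aabbcc", "aabbc", "abcabc", ""]:
        print(repr(s), are_accepts(s, EQUAL_COUNTS))
```

The sketch separates the two stages that the abstract describes for recognition: parsing the string with the RE to obtain the star counts, and then checking the constraints on those counts. A string such as `aabbc` matches the RE but is rejected by the constraints, which is exactly the extra expressive power that the RE alone lacks.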
