Inference of Finite-State Probabilistic Grammars

The inference of finite-state probabilistic grammars is studied from two points of view. The first is theoretical: the topics investigated include the structural and statistical properties of probabilistic grammars, methods for assigning probability measures to the rewrite rules of probabilistic grammars, and statistical measures of how well an inferred probabilistic grammar approximates a sample set. The second is the development and implementation of an algorithm for inferring finite-state probabilistic grammars. This procedure produces a deterministic finite-state probabilistic grammar whose language approximates the sample set within a user-supplied acceptance region under the chi-square test. The procedure is enumerative, and heuristic tree-searching techniques are used to improve its efficiency. The convergence of the procedure to an acceptable grammar is demonstrated, and the steps of the procedure are theoretically justified. Test results from a PL/I implementation are presented. The inference procedure provides a means of synthesizing probabilistic models of both physical and abstract systems from samples of their behavior.
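The chi-square acceptance criterion mentioned above can be illustrated with a minimal Python sketch. It is not the paper's procedure or its PL/I implementation. The grammar encoding, the function names (`string_probability`, `chi_square_statistic`, `accepts`), and the choice to pool all strings absent from the sample into a single remainder cell are assumptions made only for this illustration. The critical value plays the role of the user-supplied acceptance region.

```python
from collections import Counter

# Hypothetical encoding (an assumption, not taken from the paper):
# state -> {symbol: (next_state, probability)}; the key None marks termination.
GRAMMAR = {
    "S": {"a": ("S", 0.5), "b": ("F", 0.5)},
    "F": {None: (None, 1.0)},
}

def string_probability(grammar, start, string):
    """Probability that a deterministic probabilistic grammar generates `string`."""
    state, prob = start, 1.0
    for symbol in string:
        rule = grammar[state].get(symbol)
        if rule is None:          # no rewrite rule for this symbol: string rejected
            return 0.0
        state, p = rule
        prob *= p
    end = grammar[state].get(None)  # the derivation must be able to stop here
    return prob * end[1] if end else 0.0

def chi_square_statistic(grammar, start, sample):
    """Pearson chi-square statistic of the sample against the grammar's distribution."""
    n = len(sample)
    stat, covered = 0.0, 0.0
    for string, observed in Counter(sample).items():
        p = string_probability(grammar, start, string)
        if p == 0.0:              # the grammar cannot generate an observed string
            return float("inf")
        expected = n * p
        stat += (observed - expected) ** 2 / expected
        covered += p
    # Remainder cell: every string not in the sample, observed count 0,
    # so its contribution (0 - e)^2 / e reduces to e.
    stat += n * max(0.0, 1.0 - covered)
    return stat

def accepts(grammar, start, sample, critical_value):
    """True if the statistic falls inside the user-supplied acceptance region."""
    return chi_square_statistic(grammar, start, sample) <= critical_value

sample = ["b"] * 4 + ["ab"] * 2 + ["aab"] * 2
print(chi_square_statistic(GRAMMAR, "S", sample))
print(accepts(GRAMMAR, "S", sample, critical_value=7.815))
```

In this usage the sample has three distinct strings plus the remainder cell, so the critical value 7.815 corresponds to a 5% significance level with three degrees of freedom. An enumerative search in the spirit of the abstract would apply such a check to each candidate grammar and stop at the first one accepted.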
