Parsing Mildly Context-sensitive RMS
We introduce Recursive Matrix Systems (RMS), which encompass mildly context-sensitive formalisms, and present efficient parsing algorithms for linear and context-free variants of RMS. The time complexities are $O(n^{2h+1})$ and $O(n^{3h})$, respectively, where $h$ is the height of the matrix. It is possible to represent Tree Adjoining Grammars (TAG [1], MC-TAG [2], and R-TAG [3]) as RMS uniformly.

1 Recursive Matrix Systems (RMS)

RMS = $(G, I)$ is a two-step formalism. In the first step, a grammar $G = (N, T, S, P)$ generates a set of recursive matrices. In the second step, a yield function maps the recursive matrices to a set of strings $L(G, I)$ according to the interpretation $I$. The grammars that generate recursive matrices are ordinary phrase structure grammars in which the terminal symbols are replaced by vertical vectors. Each position of such a vector holds a terminal symbol, holds a nonterminal symbol, or is left empty.

The two-step formalism: $S \in N \;\overset{\text{derive}}{\Longrightarrow}{}^{*}\; m \in M \;\overset{\text{yield}}{\longmapsto}\; w \in T^{*}$.

Example: $G_1 = (N, T, S, P) = \left(\{S\}, \{a, b, c\}, S, \left\{ S \to \begin{pmatrix} a \\ b \\ c \end{pmatrix} S,\; S \to \begin{pmatrix} a \\ b \\ c \end{pmatrix} \right\}\right)$. Every time the first rule is applied, a new column of terminals is added to the matrix. The last rule terminates the process. A possible derivation:

$S \Rightarrow \begin{pmatrix} a \\ b \\ c \end{pmatrix} S \Rightarrow \begin{pmatrix} a \\ b \\ c \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} S \Rightarrow \begin{pmatrix} a \\ b \\ c \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix}$

The product of the derivation process is a matrix with terminals as elements. Finally, these terminal symbols are combined into one string. There are many possible interpretation functions. However, it seems reasonable to read the terminal symbols within the matrix row by row from top to bottom, with the rows alternating between left-to-right and right-to-left. For height 3, this interpretation traces a zigzag path through the three rows. The grammar $G_1$ together with this interpretation, written $I$ here, generates strings of the form $aaa \ldots bbb \ldots ccc \ldots$. Thus, the generated language is $L(G_1, I) = \{a^n b^n c^n \mid n \in \mathbb{N}\}$.

Definition: $h$ is a positive integer.
A recursive matrix grammar is a four-tuple $G = (N, T, S, P)$, where $N$ and $T$ are disjoint alphabets, $S \in N$, $P$ is a finite set of ordered pairs $(u, v) \in N \times (N \cup C_h)^{*}$,
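
The two steps for the example grammar G1 (derive a matrix, then read it off row by row in alternating directions) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `derive_g1` and `zigzag_yield`, the column-list representation of a matrix, and the use of an empty string for an empty vector position are all hypothetical and do not come from the paper. It also assumes that both rules of G1 contribute one column (a, b, c), so at least one column is always produced.

```python
from typing import List

Column = List[str]      # one vertical vector of height h (assumed representation)
Matrix = List[Column]   # a matrix stored column by column


def derive_g1(n: int) -> Matrix:
    """Mimic a G1 derivation: the recursive rule n-1 times, then the final rule.

    Each rule application is assumed to add one column (a, b, c).
    """
    if n < 1:
        raise ValueError("a G1 derivation applies at least one rule")
    return [["a", "b", "c"] for _ in range(n)]


def zigzag_yield(matrix: Matrix) -> str:
    """Read rows top to bottom, alternating left-to-right and right-to-left."""
    if not matrix:
        return ""
    height = len(matrix[0])
    out: List[str] = []
    for row in range(height):
        symbols = [column[row] for column in matrix]
        if row % 2 == 1:          # second, fourth, ... row: right to left
            symbols.reverse()
        # Empty positions (modelled as "") contribute nothing to the string.
        out.extend(s for s in symbols if s)
    return "".join(out)


if __name__ == "__main__":
    print(zigzag_yield(derive_g1(3)))
```

For a matrix whose columns all equal (a, b, c), the reading direction of each row does not change the result; the direction only matters once rows contain different symbols, as in `zigzag_yield([["a", "b", "c"], ["d", "e", "f"]])`.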
[1] Tilman Becker, et al. Recursive Matrix Systems (RMS) and TAG, 1998, TAG+.
[2] Giorgio Satta, et al. Restrictions on Tree Adjoining Languages, 1998, ACL.
[3] Klaas Sikkel, et al. Parsing of Context-Free Languages, 1997, Handbook of Formal Languages.