From Partial toward Full Parsing

Full-Parsing systems able to analyze sentences robustly and completely at an appropriate accuracy can be useful in many computer applications like information retrieval and machine translation systems. Increasing the domain of locality by using tree-adjoining-grammars (TAG) caused some researchers to use it as a modeling formalism in their language application. But parsing with a rich grammar like TAG faces two main obstacles: low parsing speed and a lot of ambiguous syntactical parses. In order to decrease the parse time and these ambiguities, we use an idea of combining statistical chunker based on TAG formalism, with a heuristically rule-based search method to achieve the full parses. The partial parses induced from statistical chunker are basically resulted from a system named supertagger, and are followed by two different phases: error detection and error correction, which in each phase, different completion heuristics apply on the partial parses. The experiments on Penn Treebank show that by using a trained probability model considerable improvement in full-parsing rate is achieved.