Design of a multi-lingual, parallel-processing statistical parsing engine

Ever since the Penn Treebank [9] became widely available, numerous statistical parsers have been developed for English, e.g. [8, 5, 3]. To varying degrees, these parsers and others, while very successful at the tasks for which they were designed, had the following limitations:
• they had a fairly fixed probabilistic structure, which could be changed only by re-coding a significant portion of the program
• they had hard-coded features specific to English
• they had hard-coded features specific to the Penn Treebank
• they were designed only for a uniprocessor environment