Overcoming Incomplete Information in NLP Systems - Verb Subcategorization

This paper presents a new methodology for overcoming the incomplete information available to current natural language parsers. Although our aim is more ambitious, we focus here on incomplete descriptions of the subcategorization classes of verbs, and we sketch a proposal for overcoming the same problem for other syntactic categories. We assume a hierarchical multi-agent system architecture in which each bottom-layer agent has specialised knowledge (a perspective) about the problems that a given feature (e.g. verb subcategorization) of a syntactic category may have. Each agent holds a declarative description of those problems and, once it has obtained an explanation for a problem, can find better solutions to the parsing problem. The agents are assumed to be logic-based diagnosis agents. Each theoretically plausible hypothesis they find must then be statistically validated. The resulting pruning and ordering of validated hypotheses leads to a learning problem that must be solved to enable a natural evolution of parsers (and their lexicons).
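
The pipeline summarised above (diagnose, validate statistically, then learn) can be illustrated with a minimal Python sketch. It is not the system described in this paper. The toy lexicon format, the frame strings such as "NP PP", the names diagnose, validate, learn and run_pipeline, and the use of a raw frequency threshold (min_count) in place of a real statistical test are all assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical diagnosis: `frame` is missing from the lexical entry of `verb`."""
    verb: str
    frame: str


def diagnose(verb, observed_frame, lexicon):
    """Stand-in for a diagnosis agent: explain an unexpected frame as a lexicon gap."""
    known = lexicon.get(verb, set())
    if observed_frame in known:
        return []  # the lexicon already licenses this frame; nothing to explain
    return [Hypothesis(verb, observed_frame)]


def validate(hypotheses, corpus_counts, min_count=2):
    """Prune hypotheses seen fewer than `min_count` times and order the rest.

    A plain frequency threshold is an assumed simplification of
    statistical validation.
    """
    def count(h):
        return corpus_counts.get((h.verb, h.frame), 0)

    kept = [h for h in hypotheses if count(h) >= min_count]
    return sorted(kept, key=count, reverse=True)


def learn(lexicon, validated):
    """Extend the lexicon in place with every validated frame."""
    for h in validated:
        lexicon.setdefault(h.verb, set()).add(h.frame)


def run_pipeline(lexicon, observations, min_count=2):
    """Diagnose each distinct (verb, frame) observation, validate, then learn."""
    counts = Counter(observations)
    hypotheses = []
    for verb, frame in counts:
        hypotheses.extend(diagnose(verb, frame, lexicon))
    validated = validate(hypotheses, counts, min_count)
    learn(lexicon, validated)
    return validated


if __name__ == "__main__":
    toy_lexicon = {"give": {"NP NP"}, "sleep": {""}}
    toy_observations = [("give", "NP PP")] * 3 + [("sleep", "NP")]
    for hypothesis in run_pipeline(toy_lexicon, toy_observations):
        print(hypothesis)
```

In this sketch a frame observed only once is pruned as likely noise, while a repeatedly observed frame is added to the verb's entry. A real system would replace the threshold with a proper statistical test and the single diagnose function with a layer of specialised agents.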