ASIUM: Learning subcategorization frames and restrictions of selection

In this paper we describe Asium, a machine learning system that learns subcategorization frames of verbs and ontologies from the syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct are semantic checking of texts and improvement of syntactic parsing, but they also include text generation and translation. The inputs to Asium result from the syntactic parsing of texts: they are subcategorization examples and basic clusters formed by head words that occur with the same verb after the same preposition (or with the same syntactic role). Asium successively aggregates the clusters to form new concepts, organized as a generality graph that represents the ontology of the domain. Subcategorization frames are learned in parallel, so that as concepts are formed, they fill the restrictions of selection in the subcategorization frames. The Asium method is based on conceptual clustering. First experiments have been performed on a corpus of cooking recipes and give very promising results, which are reported here.
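To make the input and the aggregation step more concrete, the following is a minimal illustrative sketch, not Asium's actual implementation. It builds basic clusters from (verb, preposition-or-role, head word) triples and performs one aggregation pass. The function names, the example triples, the Jaccard overlap measure, and the 0.5 threshold are all assumptions introduced for illustration; the abstract does not specify the similarity measure Asium uses.

```python
from collections import defaultdict


def basic_clusters(triples):
    """Group head words by the (verb, preposition-or-role) context they occur in."""
    clusters = defaultdict(set)
    for verb, marker, head in triples:
        clusters[(verb, marker)].add(head)
    return dict(clusters)


def jaccard(a, b):
    """Overlap between two head-word sets (an assumed stand-in similarity measure)."""
    return len(a & b) / len(a | b)


def aggregate(clusters, threshold=0.5):
    """One aggregation pass: propose a new concept for each pair of clusters
    whose overlap reaches the (hypothetical) threshold."""
    keys = sorted(clusters)
    concepts = []
    for i, k1 in enumerate(keys):
        for k2 in keys[i + 1:]:
            if jaccard(clusters[k1], clusters[k2]) >= threshold:
                concepts.append(((k1, k2), clusters[k1] | clusters[k2]))
    return concepts


# Hypothetical parser output from a cooking-recipe corpus.
triples = [
    ("cook", "object", "pasta"),
    ("cook", "object", "rice"),
    ("boil", "object", "rice"),
    ("boil", "object", "pasta"),
    ("boil", "object", "egg"),
    ("cook", "in", "pan"),
    ("boil", "in", "pot"),
]

clusters = basic_clusters(triples)
concepts = aggregate(clusters)
```

In this toy data, the object clusters of "cook" and "boil" share two of three head words, so they are merged into one candidate concept, which could then serve as the restriction of selection for the object slot of both verbs. A full system would repeat the pass on the newly formed concepts to build successive levels of the generality graph.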