Data Category Registry: Morpho-syntactic and Syntactic Profiles

After a brief presentation of the data model, we describe work in progress on defining an initial set of morpho-syntactic and syntactic data categories dedicated to NLP applications. The aim is to improve interoperability among language resources and to streamline their integration into applications. The central requirement is that when one language resource uses a given value, all other language resources and programs interpret that value in the same way. In practice, these values are collected from existing lists, discussed, extended, and then recorded in a freely accessible database: the ISO Data Category Registry.

Thierry Declerck | Virach Sornlertlamvanich | Gil Francopoulo | Éric Villemonte de la Clergerie | Monica Monachini

[1] Thierry Declerck. SynAF: Towards a Standard for Syntactic Annotation, 2006, LREC.

[2] Chu-Ren Huang, et al. Constructing Taxonomy of Numerative Classifiers for Asian Languages, 2008, IJCNLP.

[3] Nancy Ide, et al. A Registry of Standard Data Categories for Linguistic Annotation, 2004, LREC.

[4] Claudia Soria, et al. Lexical Markup Framework (LMF), 2006, LREC.

[5] Chu-Ren Huang, et al. Infrastructure for Standardization of Asian Language Resources, 2006, ACL.

[6] Sue Ellen Wright. A Global Data Category Registry for Interoperable Language Resources, 2004, LREC.