Romanian Lexical Data Bases: Inflected and Syllabic Forms Dictionaries
This paper presents two lexical databases for Romanian: RoMorphoDict, a dictionary of inflected forms, and RoSyllabiDict, a dictionary of syllabified inflected forms. Each database is available in two Unicode formats: text and XML. In text format, a RoMorphoDict entry contains an inflected form, its lemma, its morpho-syntactic description, and a marking of the stressed vowel in pronunciation. In XML format, an entry represents the whole paradigm of a word and additionally contains information about its roots and paradigm class. A RoSyllabiDict entry, in both formats, contains the unsyllabified word, its syllabified counterpart, and, where applicable, grammatical information and/or the type of syllabification. The stressed vowel is also marked on the syllabified form. Each lexical database includes the inflected forms of about 65,000 lemmas, amounting to over 700,000 entries in RoMorphoDict and over 500,000 entries in RoSyllabiDict. Both resources are freely available. The paper describes in detail the content of these databases and the procedure used to build them.