In this paper, we propose automatically bootstrapping a Modern Standard Arabic WordNet at the lexeme level using Arabic-English parallel corpora and an English WordNet. We address the feasibility of such an endeavor and present a qualitative evaluation of the cross-linguistic meaning correspondences between Arabic and English. We further present an automatic means of performing this task using an unsupervised word sense disambiguation system. We test the feasibility of the bootstrapping by qualitatively evaluating the projection of the meaning definitions of English words onto their Arabic translations. We manually evaluate 447 instances of Arabic words that correspond to English words correctly sense-tagged using English WordNet 1.7 from the SENSEVAL 3 data. The evaluated words correspond to nouns, verbs, and adjectives in English. We find that, averaged over Arabic verbs, adjectives, and nouns, for 52.3% of the words examined the corresponding English WordNet set of definitions is sufficient as the set of definitions for the Arabic translation word; 39.96% of the Arabic words correspond to specific subsets of the WordNet definitions; and 7.8% of the Arabic words comprise supersets of their corresponding English WordNet translation definitions. These results are very encouraging, as they are similar to those obtained by researchers building EuroWordNet.
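The three outcome categories reported above (sufficient, subset, superset) can be illustrated with a minimal Python sketch. It assumes each manual judgment is recorded as a pair of sets: the English WordNet senses of the tagged English word and the senses judged to hold for its Arabic translation. The function names, labels, and sense identifiers are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classify_correspondence(english_senses, arabic_senses):
    """Label how the Arabic word's senses relate to the English sense set.

    Both arguments are iterables of sense identifiers (hypothetical IDs here).
    """
    english, arabic = set(english_senses), set(arabic_senses)
    if arabic == english:
        return "equivalent"  # English definitions are sufficient as-is
    if arabic < english:
        return "subset"      # only some English definitions apply
    if arabic > english:
        return "superset"    # the Arabic word carries additional senses
    return "other"           # partial overlap or disjoint sets


def summarize(judgments):
    """Return the percentage of judgments that fall under each label."""
    counts = Counter(classify_correspondence(e, a) for e, a in judgments)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy judgments with made-up sense IDs, purely for illustration.
toy_judgments = [
    ({"s1", "s2"}, {"s1", "s2"}),
    ({"s1", "s2", "s3"}, {"s1"}),
    ({"s1"}, {"s1", "x1"}),
    ({"s1"}, {"s1"}),
]
toy_summary = summarize(toy_judgments)
```

The set comparison is only one way to encode such judgments; the evaluation described above was carried out manually, and this sketch merely tallies annotations of that kind.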