Aligning GermaNet Senses with Wiktionary Sense Definitions

Sense definitions are a crucial component of wordnets and make them more usable for a wide variety of NLP applications. Many wordnets for languages other than English, including the German wordnet GermaNet, lack comprehensive coverage of such definitions. This paper automatically aligns sense descriptions from the web-based dictionary Wiktionary with lexical units in GermaNet in order to extend GermaNet with sense descriptions. An alignment algorithm based on word overlaps is developed, and different setups of the algorithm are compared. The best setup achieves an accuracy of 93.8 % and an F1-score of 84.3, which confirms the viability of the proposed method for automatically enriching GermaNet. This best result crucially involves the use of coordinated relations as a novel concept for calculating sense alignment.
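To make the idea of overlap-based alignment concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes that each GermaNet lexical unit is represented by a bag of related words and that the Wiktionary gloss sharing the most words with that bag is chosen. The function names `tokenize` and `align`, the tiny stopword list, and the example words and glosses are all hypothetical; which relations the paper's setups draw on is not specified in this abstract.

```python
import re

# Tiny illustrative stopword list (an assumption); a real setup would use a fuller German list.
STOPWORDS = {"der", "die", "das", "ein", "eine", "und", "oder", "für", "mit", "von", "in"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def align(related_words, glosses):
    """Return (index, overlap) of the gloss sharing the most words with the
    lexical unit's related words, or (None, 0) if no gloss shares any word."""
    context = tokenize(" ".join(related_words))
    best_index, best_overlap = None, 0
    for i, gloss in enumerate(glosses):
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_index, best_overlap = i, overlap
    return best_index, best_overlap


# Made-up data for the financial sense of "Bank"; not taken from GermaNet or Wiktionary.
related = ["Geldinstitut", "Kreditinstitut", "Sparkasse"]
glosses = [
    "Sitzgelegenheit für mehrere Personen",
    "Geldinstitut für Finanzdienstleistungen",
]
print(align(related, glosses))
```

In this sketch ties go to the first gloss encountered and a zero overlap yields no alignment; a practical system would likely add stemming or lemmatization and a decision threshold, both of which are left out here.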
