Finding a Location for a New Word in WordNet

FinnWordNet is a Finnish wordnet that follows the structure of the Princeton WordNet (Fellbaum, 1998). It was created by translating all the words in the Princeton WordNet, is open source, and contains over 117,000 synsets. We are now testing different methods for improving and expanding the content of FinnWordNet. Because wordnets are structured ontologies, the location of a word in FinnWordNet can be pinpointed by its relations to other words. For us, finding a location for a word therefore means finding a hyperonym, a hyponym or a synonym for the word. This article describes some methods for finding a location for a new word in FinnWordNet. Our methods include searching for multiword terms, compounds and lexicosyntactic patterns. Testing shows that a few simple methods were enough to find an indicator of the location for 83.2% of the new words. Of the new synonym pairs we tested, we were able to find an indication for 86.7%.
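As a rough illustration of the three kinds of evidence named above, the following Python sketch shows one plausible way each could point to a hyperonym candidate. It is not the authors' implementation. The toy English lexicon `KNOWN_WORDS`, the function names, the head-final assumption for compounds and multiword terms, and the single "such as" pattern with naive plural stripping are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon standing in for words already in the wordnet.
KNOWN_WORDS = {"dog", "house", "boat", "animal"}


def head_of_multiword(term, known):
    """Return the last token of a multiword term if it is a known word."""
    tokens = term.split()
    if len(tokens) > 1 and tokens[-1] in known:
        return tokens[-1]
    return None


def head_of_compound(word, known):
    """Return the longest known proper suffix of a closed compound, if any."""
    # Suffixes are tried from longest to shortest, so the first hit is the longest.
    for start in range(1, len(word)):
        suffix = word[start:]
        if suffix in known:
            return suffix
    return None


def hypernym_from_pattern(word, text):
    """Look for the lexicosyntactic pattern '<candidate>s such as <word>'.

    Assumption: the candidate's plural is formed by adding a plain 's'.
    """
    match = re.search(r"(\w+?)s such as " + re.escape(word) + r"\b", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(head_of_multiword("guide dog", KNOWN_WORDS))
    print(head_of_compound("houseboat", KNOWN_WORDS))
    print(hypernym_from_pattern("dingo", "wild animals such as dingo live here"))
```

Each function returns only a candidate. Whether the candidate is a genuine hyperonym, and which of its senses applies, would still need to be checked against the wordnet.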

[1] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.

[2] Eugene Charniak, et al. Finding Parts in Very Large Corpora, 1999, ACL.

[3] Luis Gravano, et al. Snowball: extracting relations from large plain-text collections, 2000, DL '00.

[4] Dan I. Moldovan, et al. Learning Semantic Constraints for the Automatic Discovery of Part-Whole Relations, 2003, NAACL.

[5] Daniel Jurafsky, et al. Learning Syntactic Patterns for Automatic Hypernym Discovery, 2004, NIPS.

[6] Jussi Olavi Piitulainen, et al. Explorations in the distributional and semantic similarity of words, 2011.

[7] Dekang Lin, et al. Automatic Retrieval and Clustering of Similar Words, 1998, ACL.

[8] Christiane Fellbaum, et al. Book Reviews: WordNet: An Electronic Lexical Database, 1999, CL.

[9] Jussi Piitulainen, et al. Discovering Synonyms and Other Related Words, 2004.

[10] Eduard Hovy, et al. Towards terascale knowledge acquisition, 2004, COLING 2004.

[11] Marti A. Hearst. Automated Discovery of WordNet Relations, 2004.

[12] D. Tufis, et al. BalkaNet: Aims, Methods, Results and Perspectives. A General Overview, 2004.

[13] Sergey Brin, et al. Extracting Patterns and Relations from the World Wide Web, 1998, WebDB.

[14] Sharon A. Caraballo. Automatic construction of a hypernym-labeled noun hierarchy from text, 1999, ACL.

[15] George A. Miller, et al. Introduction to WordNet: An On-line Lexical Database, 1990.

[16] Dan I. Moldovan, et al. Automatic Discovery of Part-Whole Relations, 2006, CL.

[17] Zellig S. Harris, et al. Mathematical structures of language, 1968, Interscience tracts in pure and applied mathematics.

[18] Patrick Pantel, et al. Discovering word senses from text, 2002, KDD.

[19] Eduard H. Hovy, et al. Learning surface text patterns for a Question Answering System, 2002, ACL.

[20] Luis Gravano, et al. Snowball: a prototype system for extracting relations from large text collections, 2001, SIGMOD '01.

[21] Patrick Pantel, et al. Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations, 2006, ACL.

[22] Patrick Pantel, et al. Concept Discovery from Text, 2002, COLING.

[23] Piek Vossen, et al. EuroWordNet: A multilingual database with lexical semantic networks, 1998, Springer Netherlands.