SALDO: a touch of yin to WordNet’s yang

The English-language Princeton WordNet (PWN) and wordnets for some other languages have been used extensively as lexical–semantic knowledge sources in language technology applications, owing to their free availability and their size. The ubiquity of PWN-type wordnets tends to overshadow the fact that they represent only one of many possible ways of structuring a lexical–semantic resource. It can therefore be enlightening to look at a differently structured resource, from the point of view of both theoretical–methodological considerations and practical text processing requirements. The resource described here, SALDO, is such a lexical–semantic resource: it is intended primarily for use in language technology applications and offers an alternative organization to PWN-style wordnets. We present our work on SALDO, compare it with PWN, and discuss some implications of the differences. We also describe an integrated infrastructure for computational lexical resources in which SALDO forms the central component.
