Joining Forces Pays Off: Multilingual Joint Word Sense Disambiguation

We present a multilingual joint approach to Word Sense Disambiguation (WSD). Our method exploits BabelNet, a very large multilingual knowledge base, to perform graph-based WSD across different languages, and combines the evidence from these languages using ensemble methods. The results show that complementing wide-coverage multilingual lexical knowledge with robust graph-based algorithms and combination methods achieves state-of-the-art performance in both monolingual and multilingual WSD settings.
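The combination step can be illustrated with a minimal sketch. It assumes that a graph-based disambiguator has already produced one score per candidate sense in each language, and it merges those scores with a simple sum rule. The sense labels, the scores, and the function names `normalize` and `combine_by_sum` are hypothetical, and the sum rule is only one possible ensemble strategy, not necessarily the one used in the paper.

```python
from collections import defaultdict


def normalize(scores):
    """Scale a sense -> score map so that the scores sum to 1."""
    total = sum(scores.values())
    if total == 0:
        return {sense: 0.0 for sense in scores}
    return {sense: value / total for sense, value in scores.items()}


def combine_by_sum(per_language_scores):
    """Sum normalized sense scores over all languages; return the best sense."""
    combined = defaultdict(float)
    for scores in per_language_scores.values():
        for sense, value in normalize(scores).items():
            combined[sense] += value
    best = max(combined, key=combined.get)
    return best, dict(combined)


# Hypothetical graph scores for two senses of "bank", one map per language.
# These numbers are made up for illustration only.
per_language_scores = {
    "en": {"bank_river": 3.0, "bank_finance": 3.0},
    "it": {"bank_river": 1.0, "bank_finance": 4.0},
    "de": {"bank_river": 2.0, "bank_finance": 2.0},
}

best_sense, combined = combine_by_sum(per_language_scores)
print(best_sense)
```

In this toy input the English and German scores are tied, so the Italian evidence decides the outcome. Normalizing per language before summing keeps a language with larger raw scores from dominating the vote.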
