Survey of the word sense disambiguation and challenges for the Slovak language

The main goal of this paper is to explain the key concepts of word sense disambiguation (WSD) as they apply to the Slovak language. It provides a comprehensive survey of current approaches and evaluation methodologies, with special attention to the necessary language resources and tools. The paper addresses problems specific to the Slovak language (missing language resources, rich morphology, and free word order) and discusses possible solutions. The conclusion outlines research directions for a Slovak WSD system that uses available language resources and unsupervised approaches.
