The Predictive Power of the (Micro)Context Revisited – Behavioral Profiling and Word Sense Disambiguation.

One of the most pressing issues in lexical semantics is the lack of solid empirical (linguistic) criteria for distinguishing word senses. The methodology explored in this paper starts from the premise that a largely, if not completely, criteria-based account of word senses is possible when word sense discrimination is approached through a combination of supervised and unsupervised word sense disambiguation (WSD). We aim to test this claim, which has also been raised by the recently re-emerging corpus-based bag-of-words approaches to WSD. The paper concludes that criteria drawn exclusively from linguistic (microcontextual) data are not sufficient, distinctive, or useful enough for successful WSD, and it shows that a solely linguistic account is not applicable in practice.
