Computational Analysis versus Human Intuition: A Critical Comparison of Vector Semantics with Manual Semantic Classification in the Context of Plains Cree

A persistent challenge in creating semantically classified dictionaries and other lexical resources is the slow and expensive process of manual semantic classification, a hindrance that can put adequate semantic resources out of reach for under-resourced language communities. We explore an alternative to manual classification based on a vector semantic method which, although not yet at the level of human sophistication, can provide usable first-pass semantic classifications in a fraction of the time. As a case example, we use a Plains Cree dictionary (ISO: crk, Algonquian, Western Canada and United States).
