Acquiring lexical knowledge using raw corpora and unsupervised clustering method

In this paper, we propose a computational model for the automatic acquisition of lexical knowledge, based on the principles of human language information processing. The model assumes a hybrid representation of the human mental lexicon that combines full-list and decomposition forms. The proposed method automatically acquires lexical entries and their grammatical knowledge using unsupervised learning techniques. To evaluate the method, we ran the lexical knowledge acquisition process on a large-scale corpus of over 10 million lexical items and analyzed the results.
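To make the hybrid full-list/decomposition idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the function name `build_hybrid_lexicon`, the frequency threshold, the suffix-length limit, and the scoring heuristic are all assumptions chosen for illustration. Frequent word forms are stored whole (full-list), while rarer forms are split into a stem and a suffix when both parts recur across several word types in the raw corpus.

```python
from collections import Counter


def build_hybrid_lexicon(tokens, full_list_min_count=3,
                         max_suffix_len=3, min_stem_len=2):
    """Toy hybrid lexicon (illustrative assumption, not the paper's method).

    Returns (full_list, decomposed):
      full_list  maps a word form to its corpus count,
      decomposed maps a word form to a (stem, suffix) pair.
    """
    counts = Counter(tokens)

    # Count how many distinct word types share each candidate stem/suffix.
    stem_types = Counter()
    suffix_types = Counter()
    for word in counts:
        for k in range(1, max_suffix_len + 1):
            if len(word) - k >= min_stem_len:
                stem_types[word[:-k]] += 1
                suffix_types[word[-k:]] += 1

    full_list, decomposed = {}, {}
    for word, n in counts.items():
        # Frequent forms are stored whole (full-list representation).
        if n >= full_list_min_count:
            full_list[word] = n
            continue

        # Otherwise look for a split whose stem and suffix both recur
        # in at least two word types; keep the highest-scoring split.
        best = None
        for k in range(1, max_suffix_len + 1):
            if len(word) - k < min_stem_len:
                continue
            stem, suffix = word[:-k], word[-k:]
            if stem_types[stem] >= 2 and suffix_types[suffix] >= 2:
                score = stem_types[stem] * suffix_types[suffix]
                if best is None or score > best[0]:
                    best = (score, stem, suffix)

        if best is not None:
            decomposed[word] = (best[1], best[2])
        else:
            # No reliable split found: fall back to whole-form storage.
            full_list[word] = n

    return full_list, decomposed


tokens = "walked walking talked talking jumped jumping the the the".split()
full_list, decomposed = build_hybrid_lexicon(tokens)
```

On this toy input, the frequent form `the` stays in the full list, while forms such as `walked` and `jumping` are decomposed because their stems and suffixes recur across other word types. A real system would need a far larger corpus and a more principled criterion than this hand-picked threshold.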
