Contribution of sublexical information to word meaning: An objective approach using latent semantic analysis and corpus analysis on predicates

Keisuke Inohara (kei.inohara@gmail.com)
Taiji Ueno (taijiueno7@gmail.com)

1. Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, JAPAN
1. Department of Psychology, Graduate School of Environmental Studies, Nagoya University, Furo-cho, Chikusa-ku, Nagoya City, Aichi 4648601, JAPAN
2. Japan Society for the Promotion of Science

Abstract

Past studies have employed a subjective rating/categorization methodology to investigate whether radicals, an example of sub-lexical visual information in Chinese/kanji, contribute to the computation of character/word meaning, with conflicting results. This study took an objective, corpus-based approach for the first time. Specifically, we conducted a Latent Semantic Analysis based on Japanese newspaper text (Experiment 1), and found that radical friends (kanji characters with the same radicals) appeared in more similar linguistic contexts than radical enemies (kanji characters that do not include the same radicals). In addition, we consulted a noun-verb predicate corpus extracted from Japanese web texts (Experiment 2), and showed that nouns including radical friends tended to take more similar predicates than nouns with radical enemies. These findings suggest that characters/words with similar meanings tend to share radicals in kanji, which may explain how children are able to efficiently learn to use the vast number of characters in Chinese/Japanese.

Keywords: semantic radical; latent semantic analysis; predicates; orthography; semantics

Introduction

How word meanings are computed from orthography and phonology (and vice versa) has been a central issue in the psycholinguistic literature. For example, neurocognitive theories differ in whether reading aloud (orthography-phonology mapping) necessarily involves a computation of word meaning (orthography → meaning → phonology) or not (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Plaut, McClelland, Seidenberg, & Patterson, 1996). To address this issue, one first needs to understand how a word meaning is computed from its written form (Harm & Seidenberg, 2004).

Computation of word meaning is also a practical concern in child language acquisition/teaching of both alphabetic (Rayner, Foorman, Perfetti, Pesetsky, & Seidenberg, 2001) and non-alphabetic (Hino, Miyamura, & Lupker, 2011) languages. Learning to spell/read the vast number of Chinese characters, and the Japanese kanji adapted from them, is a demanding problem (Shu, Chen, Anderson, Wu, & Xuan, 2003; Tamaoka & Yamada, 2000). For example, there are 2,136 Japanese kanji characters designated for everyday use. In teaching so many items, a particular emphasis on lexical/semantic associations with orthography/phonology might be effective (NB: to a lesser extent, English also shares this issue, as it does not code each vowel with one specific phoneme).

Although a word meaning is a word-specific (i.e., whole-word) type of knowledge, sub-lexical information also contributes to its computation. Evidence for this has accumulated in alphabetic languages (Libben, 1998; Marslen-Wilson, Tyler, Waksler, & Older, 1994), and to a greater extent in logographic languages (Hino et al., 2011; Shu et al., 2003). Further insight into this issue has been gleaned by investigating the role of sub-character (visual) information (e.g., radicals) in Chinese and Japanese kanji (Ogawa, 2013; Shu et al., 2003; Tamaoka, 2005). For example, a native Japanese speaker would agree that the kanji characters 洗 (wash) and 流 (flow) share similar meanings (e.g., water) because these characters share a radical (the left part of each character).

However, the outcomes of scientific investigations are not consistent regarding the role of radicals in computing character/word meanings. Specifically, Hino et al. (2011) concluded that orthographic neighbors in kanji compounds (sharing one/two kanji characters, thereby sharing radicals as well) tend to have similar meanings, but the degree of shared meaning did not differ from that of orthographic neighbors in kana (another type of orthography, without radicals, which codes phonemes/morae in Japanese), suggesting that the existence of a radical in kanji is not particularly helpful in computing word meanings. Furthermore, other studies suggest that there are some exceptional characters whose meanings differ from those of other characters with the same radicals (Ogawa, 2013; Shu et al., 2003; Tamaoka, 2005). Therefore, it remains unclear whether sub-character information in Chinese/kanji contributes to character/word meaning.

These mixed outcomes may stem from the way in which word/character meanings (and semantic similarity) were measured. Specifically, all of these studies employed subjective ratings, such as asking (on a Likert scale) how similar in meaning radical/character neighbors are (Hino et al., 2011), or asking raters to judge or categorize the extent to which each character's meaning is consistent with its radical's meaning (e.g., in the example above, how relevant the meaning of 洗 is to water) (Ogawa, 2013; Shu et al., 2003; Tamaoka, 2005). These subjective ratings/categorizations could be affected by demand characteristics (Orne, 1962) and by the list composition. Therefore, in this study, we aimed to investigate how radicals predict word meanings by using objective measures of semantic similarity. A closest
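The kind of objective comparison described in the abstract (radical friends versus radical enemies in a Latent Semantic Analysis space) can be illustrated with a minimal sketch. This is not the authors' pipeline: the count matrix below is invented toy data rather than newspaper counts, and the log weighting, the two retained dimensions, and all function names are assumptions made only for illustration.

```python
from itertools import combinations

import numpy as np


def lsa_vectors(counts, k):
    """Project each row (character) of a count matrix into a k-dimensional LSA space."""
    # Log-scale the raw counts; this weighting is an assumption, not the paper's choice.
    weighted = np.log1p(counts.astype(float))
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    # Keep the k largest singular dimensions, scaled by their singular values.
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either vector is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def mean_pair_similarity(vectors, pairs):
    """Average cosine similarity over a list of (row_index, row_index) pairs."""
    return float(np.mean([cosine(vectors[i], vectors[j]) for i, j in pairs]))


# Hypothetical toy data: rows are characters, columns are documents.
chars = ["洗", "流", "泳", "林", "村", "校"]
radical = {"洗": "氵", "流": "氵", "泳": "氵", "林": "木", "村": "木", "校": "木"}
counts = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [3, 3, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
    [0, 0, 2, 2],
])

vectors = lsa_vectors(counts, k=2)

# Radical friends share a radical; radical enemies do not.
all_pairs = list(combinations(range(len(chars)), 2))
friend_pairs = [(i, j) for i, j in all_pairs if radical[chars[i]] == radical[chars[j]]]
enemy_pairs = [(i, j) for i, j in all_pairs if radical[chars[i]] != radical[chars[j]]]

friend_sim = mean_pair_similarity(vectors, friend_pairs)
enemy_sim = mean_pair_similarity(vectors, enemy_pairs)
print(f"friends: {friend_sim:.3f}, enemies: {enemy_sim:.3f}")
```

Because the toy counts were constructed so that characters sharing a radical also share documents, the sketch only demonstrates the mechanics of the comparison; it says nothing about what a real corpus would show.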
References

[1] Gary Libben, et al. Semantic Transparency in the Processing of Compounds: Consequences for Representation, Processing, and Impairment. Brain and Language, 1998.
[2] James L. McClelland, et al. A distributed, developmental model of word recognition and naming. Psychological Review, 1989.
[3] Peng Jin, et al. Distributional Similarity for Chinese: Exploiting Characters and Radicals. 2012.
[4] Hiroyuki Yamada, et al. The effects of stroke order and radicals on the knowledge of Japanese kanji orthography, phonology and semantics. 2000.
[5] M. Orne. On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. 1962.
[6] Jeffrey L. Elman, et al. Finding Structure in Time. Cogn. Sci., 1990.
[7] M. Coltheart, et al. DRC: a dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 2001.
[8] James L. McClelland, et al. Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review, 1996.
[9] T. Shallice, et al. Deep Dyslexia: A Case Study of. 1993.
[10] Richard C. Anderson, et al. Properties of school Chinese: implications for learning to read. Child Development, 2003.
[11] T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. 1997.
[12] R. Rosenthal. Meta-analytic procedures for social research. 1984.
[13] W. Marslen-Wilson, et al. Morphology and meaning in the English mental lexicon. 1994.
[14] Mark S. Seidenberg, et al. Computing the meanings of words in reading: cooperative division of labor between visual and phonological processes. Psychological Review, 2004.
[15] Gregory V. Jones. Deep dyslexia, imageability, and ease of predication. Brain and Language, 1985.
[16] S. Lupker, et al. The nature of orthographic–phonological and orthographic–semantic relationships for Japanese kana and kanji words. Behavior Research Methods, 2011.
[17] Mark S. Seidenberg, et al. How Psychological Science Informs the Teaching of Reading. Psychological Science in the Public Interest, 2022.