THEORIES AND APPLICATIONS OF HIGH-DIMENSIONAL SEMANTIC MODELS

Automatically deriving readers' knowledge structures from texts

Latent semantic analysis (LSA) serves as both a theory and a method for representing the meaning of words based on a statistical analysis of their contextual usage (Foltz, 1996; Landauer & Dumais, 1997). In experiments in the domains of psychology and history, we compared the representation of readers' knowledge structures of information learned from texts with the representation generated by LSA. Results indicated that LSA's representation is similar to readers' representations. In addition, the degree to which the reader's representation is similar to LSA's representation is indicative of the amount of knowledge the reader has acquired and of the reader's reading ability. This approach has implications both as a model of learning from text and as a practical tool for performing knowledge assessment.

The acquisition of knowledge requires both the acquisition of a set of concepts related to that knowledge and an understanding of the relationships among those concepts. Together, the concepts and the relationships among them form a representation of the learner's knowledge structure of a topic. By assessing a learner's characterization of the relationships among concepts, we can measure that person's knowledge structures. In addition, correlations of these relationships among multiple learners can be used to characterize the similarity of the knowledge structures between participants. This approach permits a researcher to analyze the effect a particular text may have on a reader's knowledge structures, which may in turn be used to determine what characteristics of the text had particular effects on learning. In this paper, we present a method of automatically deriving a knowledge structure based on statistical analysis of a text and assess how well this knowledge structure corresponds to the knowledge structures formed by a human reader of that text.
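As an illustration of how correlations over concept relationships can characterize the similarity of two knowledge structures, the following minimal sketch correlates the pairwise relatedness ratings of two readers. The concept list, the 1-7 rating scale, and the rating values are hypothetical and chosen only for illustration; they are not data from the experiments reported here, and the use of a Pearson correlation is an assumption about one reasonable way to carry out such a comparison.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical concepts from a text; every unordered pair is rated once.
concepts = ["stimulus", "response", "reinforcement", "memory"]
pairs = list(combinations(concepts, 2))

# Hypothetical relatedness ratings (1 = unrelated, 7 = highly related),
# listed in the same order as `pairs`, one list per reader.
reader_a = [6, 5, 2, 6, 1, 2]
reader_b = [7, 4, 1, 5, 2, 3]

# A higher correlation indicates more similar knowledge structures.
similarity = pearson(reader_a, reader_b)
print(f"Similarity of knowledge structures: {similarity:.2f}")
```

The same comparison could be made between a reader's ratings and relatedness values derived from a model, provided both are expressed over the same set of concept pairs.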
The development of semantic models of memory has long relied on using psychometric approaches and a large number of participants' judgments to determine the relationships among concepts (see, e.g., Osgood, Suci, & Tannenbaum, 1957). With the advent of more powerful com