Using latent semantic analysis to assess knowledge: Some technical considerations

In another article (Wolfe et al., 1998/this issue), we showed how Latent Semantic Analysis (LSA) can be used to assess student knowledge: how LSA can grade essays and how it can match students with appropriate instructional texts. We did this by comparing a student's essay with one or more target instructional texts, using the cosine between the vector representation of the essay and that of the instructional text in question. This simple method was effective for the purpose, but questions remain about how LSA achieves its results and how the results might be improved. Here, we address four such questions: (a) What role does the use of technical vocabulary play? (b) How long should the student essays be? (c) Is the cosine the optimal measure of semantic relatedness? (d) How does one deal with the directionality of knowledge in the high-dimensional space?
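The cosine comparison described above can be illustrated with a minimal sketch. It assumes the essay and the instructional texts have already been projected into an LSA space. The three-dimensional vectors, the names `essay`, `text_a`, and `text_b`, and the helper `cosine` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical vectors; a real LSA space has far more dimensions.
essay = [0.2, 0.5, 0.1]
texts = {
    "text_a": [0.4, 1.0, 0.2],   # same direction as the essay, twice the length
    "text_b": [0.5, -0.2, 0.0],  # orthogonal to the essay
}

# Score the essay against every candidate text and pick the closest one.
scores = {name: cosine(essay, vec) for name, vec in texts.items()}
best_match = max(scores, key=scores.get)
```

Because the cosine depends only on the angle between the vectors, `text_a` scores as a perfect match even though it is twice as long as the essay vector. Vector length is ignored by this measure, which is one reason to ask whether the cosine is the optimal measure of relatedness.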