Strategies for Controlling Hypothesis Formation in Reading

Reading is a process of forming and evaluating hypotheses to account for the data in a text. Because of its complexity, the task of reading requires strategies for controlling the proliferation of hypotheses. Four of these strategies, (a) jumping to conclusions, (b) maintaining inertia, (c) relying on background knowledge, and (d) working backwards from the goal, are generally effective, but they occasionally create reading problems, rather than alleviating them. Examples from protocols of readers reading a reading test passage are presented. These examples show both the effective use of the strategies and some problems that may arise from their use.

Imagine being confronted with the following task: From a limited set of data you are to build an exceedingly complex theory. Every step of the way you will encounter ambiguities. Partial theories will be necessary, but there is no way to be sure until the end that any partial theory can be incorporated into the final theory. Almost all of the possible theories you might consider are wrong, and yet, many of them will have ample supporting evidence. You will be given the data only bits at a time; thus, you may well be sent down what linguists call a "garden path" of misleading theories. You cannot be certain that there is a single theory that best accounts for the data. Even the best theory you find may leave some data unaccounted for. You are to do theory-constructing as you gather the data. The time allotted for the task is vanishingly small, no more than the time it has taken you to read this description of it.

Faced with such a task, a reasonable person might well turn his or her thoughts elsewhere, perhaps to the ballgame outside the window; and that is what many children do when they are given the task of reading. The fact is that reading is a task with all the properties described above: The reader must build a complex theory from limited data in a short time.
The data arrive a piece at a time as the eye moves across the page. Reading at a normal pace introduces ambiguity at even the lowest level because the reader can only sample from the text. This ambiguity is magnified at the level of words and sentences. Other ambiguities arise at the higher structural levels. Theories to account for the meaning of parts of the text proliferate because the data are limited and ambiguous, and the theories can become increasingly complex as the reader tries to account for larger portions of the text. Knowledge of the world, the prior text, the author, and the purpose of reading all need to be incorporated into the theories the reader builds, but this knowledge complicates the theories further.

In this paper, we look at the process a reader must use to cope with difficulties of the kinds just described. Essentially, we view reading as a process of forming and evaluating hypotheses to account for the data in the text, and we discuss the central importance in this process of four strategies for controlling the proliferation of hypotheses. The view presented is not unique; what is different is our attempt to draw out the unforeseen implications and consequences of such a view. By taking the notion of "controlling hypothesis formation" seriously, identifying specific strategies, and working through an extended example, we describe in more detail the comprehension process when it works and when it goes awry.

Our analysis does not lead to prescriptions of specific instructional methods for reading comprehension. Instead, we hope to provide a concrete reference for teachers of reading to a perspective on the comprehension process which emphasizes (a) that miscomprehension can be due to good strategies missing the mark, (b) that even a seemingly straightforward text can lead to a large number of varying interpretations when it is read by a group of different readers, and (c) that a choice between two substantially different interpretations can result from a relatively small decision in the comprehension process.

Our analysis includes both a general discussion of the process of answering questions about a reading test passage and examples from several protocols of students discussing the text. The examples both provide empirical support for the general approach and make the theory more accessible to teachers, students and classrooms.

Hypothesis Formation and Evaluation

There is one rather obvious way to cope with a task of the difficulty described above: Collect as much knowledge as possible and apply it at every step of the hypothesis formation process. Such knowledge is of various types. First, a reader needs knowledge of structures at the levels of letter features, letters, words, sentences, and even whole texts. She or he also needs knowledge of the meaning of these structures, such as the fact that in the passive voice construction the object of an action is in the syntactic subject position. Perhaps most importantly, the reader needs pragmatic knowledge, that is, knowledge about the use of language. Included in this last category are knowledge of facts about the world, knowledge of the author, knowledge of the time and place of the writing and reading of the text, knowledge of the task, and knowledge of one's own knowledge and abilities (Brown, 1980). Discussions of the knowledge needed for reading can be found in Adams and Bruce (1980), Rumelhart (1977), Olson, Duffy, and Mack (in press), and Spiro, Bruce, and Brewer (1980).

Essential among the types of knowledge needed for reading is strategic knowledge, that is, knowledge about how to use each of the above knowledge sources.
Coordinating them is a complex task, as there is increasing evidence that knowledge sources interact in a heterarchical fashion; that is, although they may naturally form a knowledge hierarchy running from orthographic knowledge to expectations about overall text structure, communication is not limited to adjacent members of the hierarchy. Earlier models of reading postulated less complicated mechanisms. The scenario proposed by Gough (1972) and LaBerge and Samuels (1974), for example, involved a visual input being processed sequentially at various knowledge levels, and arriving, finally, at a "meaning." More current models involve each knowledge source putting in its "two-cents' worth" at various points in the progression to comprehension of the text (Rumelhart, 1977).

In viewing reading as a hypothesis-driven process (Rubin, Note 1) we define a hypothesis as a central structure which collects evidence for a particular interpretation of a text. Two general characteristics of hypotheses are important to mention here. First, a hypothesis represents a possible interpretation which may later either be proven or disproven. At various points during the reading process it may be in a state of limbo, only partially specified, needing more evidence, or perhaps even uncertain because of conflicting evidence. As a consequence of additional information, the reader may later have to "back up" and re-hypothesize about the meaning of a portion of the text. A second characteristic is that part of the structure of a hypothesis is the specification of those pieces of evidence which support or contradict it. A piece of evidence can even be another hypothesis. Hypotheses are then linked together in a network of "supporting" and "contradicting" relations.

Several existing reading theories share significant properties with the general form described here (although they differ in important details).
Goodman (1973) describes receptive language processes in general as hypothesis-based, defining them as "cycles of sampling, predicting, testing and confirming." He recognizes three levels of cues which readers use: graphemic, syntactic, and semantic; these cue systems are used "simultaneously and interdependently." Productive reading is seen as requiring strategies which facilitate the selection of the most useful cues.

Smith (1973) emphasizes the contribution of what he terms "nonvisual" information to reading. This nonvisual knowledge includes what people already know about reading, language, and the world in general. He argues particularly that reading is not decoding to sound, but rather that semantic and other nonvisual processes intercede between visual processes and reading aloud.

A different approach, which nevertheless assumes a hypothesis-based process, is that of Perfetti (Note 2). He suggests ways in which the various component processes might interact, basing his overall conclusions on the fact that all the processes which occur during reading comprehension must share a "limited capacity processor."

The limited-capacity processor view suggests a potential problem in the use of knowledge for reading comprehension: Although different types of knowledge are needed to evaluate hypotheses, each chunk of knowledge may also aid in the construction of new hypotheses. Thus, the evaluation, and hence elimination, of hypotheses vies with the formation of new hypotheses in determining the size of the hypothesis space. What is needed are strategies for controlling the proliferation of hypotheses. Details of such strategies have been discussed elsewhere (Collins, Brown, & Larkin, 1980; Erman, Hayes-Roth, Lesser, & Reddy, 1980; Woods, 1980; Rubin, Note 1). The point we will make here, however, is that strategies that cut down the number of hypotheses for consideration have other, qualitative effects as well.
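
As an illustration only, the hypothesis structure defined earlier in this section (an interpretation that may be in limbo, confirmed, or rejected, and that records the evidence supporting or contradicting it, where evidence may itself be another hypothesis) can be sketched in Python. The class and field names, the simple counting rule in `balance`, and the example sentence are assumptions made for this sketch; they are not taken from the models cited above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class Status(Enum):
    OPEN = "open"            # in limbo: partially specified or awaiting evidence
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass(eq=False)
class Hypothesis:
    """One possible interpretation plus the evidence bearing on it."""
    interpretation: str
    status: Status = Status.OPEN
    supporting: List["Evidence"] = field(default_factory=list)
    contradicting: List["Evidence"] = field(default_factory=list)

    def balance(self) -> int:
        # Hypothetical scoring rule: live supporting evidence minus
        # live contradicting evidence.
        return _live(self.supporting) - _live(self.contradicting)


# A piece of evidence is either a datum from the text or another hypothesis.
Evidence = Union[str, Hypothesis]


def _live(items: List[Evidence]) -> int:
    """Count evidence, ignoring hypotheses the reader has already rejected."""
    return sum(
        1 for e in items
        if not (isinstance(e, Hypothesis) and e.status is Status.REJECTED)
    )


# Illustrative garden-path sentence (an assumption): "The old man the boats."
noun = Hypothesis('"man" is a noun')
verb = Hypothesis('"man" is a verb')

noun.supporting.append('"old" usually modifies a following noun')
noun.contradicting.append('no other verb is available before "the boats"')
verb.supporting.append('no other verb is available before "the boats"')
verb.contradicting.append(noun)  # a competing hypothesis serves as evidence

before = (noun.balance(), verb.balance())

# The reader "backs up": rejecting the noun reading removes it as
# evidence against the verb reading.
noun.status = Status.REJECTED
after = verb.balance()
```

In this sketch, rejecting one hypothesis changes the standing of every hypothesis linked to it, which is one simple way to picture the "supporting" and "contradicting" network. A fuller model would also need weighted evidence and some account of partially specified hypotheses, neither of which is attempted here.
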
We assume that these strategies operate within a process that maintains many hypotheses at once, but actively works on only a few at any one time. New hypotheses are spawned from the ones under active consideration. Thus, a strategy for focusing attention on one hypothesis out of a set of competing hypotheses (or choice set; Rubin, Note 1) would limit the number and type of new hypotheses that are generated. We have identified four such strategies: jumping to conclusions (choosing one hypothesis out of a choice set and focusing on it despite ins