REASONING ABOUT DISTRIBUTION: A COMPLEX PROCESS

which presents research at the forefront of building conceptual foundations for statistics education. According to Moore (1990, p. 136), statistical thinking is an “independent and fundamental intellectual method that deserves attention in the school curriculum.” He could equally have stated that statistical thinking deserves attention from research. He also hoped that “in the future pupils will bring away from their schooling a structure of thought that whispers, ‘Variation matters … Why not draw a graph?’” (Moore, 1991, p. 426). With considerable foresight, Moore encapsulated not only the building blocks for statistical thinking but also two deep research questions with which statistics education researchers are currently grappling: How do students actually reason about variability and distribution? How do these two types of reasoning develop? Variation is at the heart of statistical thinking, but reasoning about variation is enabled through diagrams or displays, such as graphs or frequency distributions of data, that “represent intuitively the original reality via an intervening conceptual structure” (Fischbein, 1987, p. 165). The conceptualization of variation “through a lens, which is ‘distribution’” (Wild, 2005) was originally fostered by Quetelet in the 1840s (Porter, 1986). Connecting variation in nature to distribution structures was a major conceptual obstacle in the history of statistics. It was not until the end of the 19th century that the astronomers’ error curve was re-conceptualized as a distribution governing variation in social data. According to Bakker and Gravemeijer (2004), distribution is the conceptual entity for thinking about variability in data. A discussion about the nature of distributions must therefore consider both conceptual and operational aspects.
A conceptual perspective focuses on clarifying what notions underpin distributions and why these notions are important, whereas an operational perspective focuses on how a specific set of data is captured, displayed and manipulated by distributions. Reasoning about distributions involves interpreting a complex structure that includes not only reasoning about features such as centre, spread, density, skewness, and outliers but also other ideas such as sampling, population, causality and chance. These other ideas lead towards connecting empirical data with probabilistic notions, which in turn develops an awareness of empirical and theoretical distributions. In fact, Bakker and Gravemeijer (2004), in the context of data analysis, believe that focusing on distribution might bring