Developing and Validating an Instrument to Measure College Students' Inferential Reasoning in Statistics: An Argument-Based Approach to Validation

University of Minnesota Ph.D. dissertation. June 2012. Major: Educational Psychology. Advisor: Robert delMas. 1 computer file (PDF); xiii, 281 pages, appendices A-J.
