Mixed Membership Distributions with Applications to Modeling Multiple Strategy Usage

This dissertation examines two related questions: How do mixed membership models work? And can mixed membership be used to model how students use multiple strategies to solve problems? Mixed membership models have been used in thousands of applications, from text and image processing to genetic microarray analysis. Yet these models are crafted on a case-by-case basis because we do not yet understand the larger class of mixed membership models. The work presented here addresses this gap and examines two different aspects of the general class of models. First, I establish that categorical data is a special case that allows for a different interpretation of mixed membership than in the general case. Second, I present a new identifiability result that characterizes equivalence classes of mixed membership models which produce the same distribution of data. These results provide a strong foundation for building a model that captures how students use multiple strategies. How to assess which strategies students use is an open question. Most psychometric models either do not model strategies at all or assume that each student uses a single strategy on all problems, even if they allow different students to use different strategies. The problem is that this is not what students do: students switch strategies. Even on the very simplest of arithmetic problems, students use
