Decision criteria for model comparison using the parametric bootstrap cross-fitting method

When computational cognitive models are compared with respect to their ability to fit empirical data, it is important to take the models' complexity into account. The parametric bootstrap cross-fitting method (PBCM; Wagenmakers, Ratcliff, Gomez, & Iverson, 2004) is a promising approach to model comparison and selection that accounts for the compared models' complexity. Applying the PBCM requires solving a classification problem: determining whether a goodness-of-fit value generated from the compared models is more likely to have arisen under one or the other of two reference distributions. The existing literature on the PBCM provides little explicit information on (a) the properties of the distributions one should expect to arise when using the PBCM or (b) which methods for solving the classification problem are suitable in which situations. This lack of information may hamper the use of the PBCM by cognitive modelers. As part of our general endeavor to make sophisticated modeling methods more available and accessible to cognitive scientists developing computational models, in this article we provide detailed analyses of both the distributions that can be expected to arise when employing the PBCM and the performance characteristics of 8 classification methods. Simulation studies involving 6 artificial pairs of distributions and pairs of distributions arising from 8 pairs of existing cognitive models indicate (a) that the relative location, but not the shape, of the two distributions can be expected to be constrained and (b) that the k-nearest-neighbor method constitutes a good general choice for solving the classification problem.
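To make the PBCM pipeline concrete, the following is a minimal, self-contained sketch of the procedure the abstract describes: fit both models to the data, generate parametric-bootstrap samples from each fitted model, cross-fit both models to every sample to obtain two distributions of goodness-of-fit differences, and then classify the empirical difference with k-nearest neighbors. The two "models" here (Gaussian vs. Laplace, with closed-form maximum-likelihood fits), the sample sizes, the number of bootstrap replicates, and the choice k = 11 are all illustrative assumptions standing in for the cognitive models and settings examined in the article, not the article's actual setup.

```python
# Hedged sketch of the PBCM with two toy models and k-NN classification.
# Assumes numpy and scikit-learn; Model A = Gaussian, Model B = Laplace.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def fit_gaussian(x):
    # Closed-form MLE for a Gaussian; returns params and negative log-likelihood.
    mu, sigma = x.mean(), x.std()
    nll = 0.5 * np.sum(np.log(2 * np.pi * sigma**2) + ((x - mu) / sigma) ** 2)
    return (mu, sigma), nll

def fit_laplace(x):
    # Closed-form MLE for a Laplace distribution (median location, MAD scale).
    loc = np.median(x)
    b = np.mean(np.abs(x - loc))
    nll = np.sum(np.log(2 * b) + np.abs(x - loc) / b)
    return (loc, b), nll

def delta_gof(x):
    # Goodness-of-fit difference: NLL(Model A) - NLL(Model B); negative favors A.
    _, nll_a = fit_gaussian(x)
    _, nll_b = fit_laplace(x)
    return nll_a - nll_b

# "Empirical" data -- simulated here for demonstration purposes only.
data = rng.laplace(0.5, 1.0, 200)
params_a, _ = fit_gaussian(data)
params_b, _ = fit_laplace(data)

# Parametric bootstrap: one distribution of GOF differences per generating model.
n_boot, n = 500, len(data)
d_from_a = np.array([delta_gof(rng.normal(*params_a, n)) for _ in range(n_boot)])
d_from_b = np.array([delta_gof(rng.laplace(*params_b, n)) for _ in range(n_boot)])

# Classification step: under which distribution is the empirical difference
# more likely? Labels encode the generating model (0 = A, 1 = B).
X = np.concatenate([d_from_a, d_from_b]).reshape(-1, 1)
y = np.array([0] * n_boot + [1] * n_boot)
knn = KNeighborsClassifier(n_neighbors=11).fit(X, y)
label = knn.predict([[delta_gof(data)]])[0]
print("Empirical GOF difference attributed to:",
      "Model A (Gaussian)" if label == 0 else "Model B (Laplace)")
```

Because the goodness-of-fit difference is one-dimensional, the k-NN step amounts to checking which label dominates among the bootstrap values nearest to the empirical difference, which is one way to read the abstract's finding that k-NN works well even when the two distributions take unconstrained shapes.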

[1] David G. Stork, et al. Pattern Classification, 1973.

[2] John R. Anderson, et al. Learning Artificial Grammars With Competitive Chunking, 1990.

[3] Michael D. Lee, et al. A Survey of Model Evaluation Approaches With a Tutorial on Hierarchical Bayesian Methods, 2008, Cognitive Science.

[4] Isabel Fraga, et al. Masked Nonword Repetition Effects in Yes/No and Go/No-Go Lexical Decision: A Test of the Evidence Accumulation and Deadline Accounts, 2022.

[5] Holger Schultheis, et al. Multi-Model Comparison Using the Cross-Fitting Method, 2014, CogSci.

[6] Roger Ratcliff, et al. Assessing Model Mimicry Using the Parametric Bootstrap, 2004.

[7] A. Vinter, et al. PARSER: A Model for Word Segmentation, 1998.

[8] C. Rotello, et al. Evaluating Models of Remember-Know Judgments: Complexity, Mimicry, and Discriminability, 2008, Psychonomic Bulletin & Review.

[9] Neal Madras. Lectures on Monte Carlo Methods, 2002.

[10] I. J. Myung, et al. When a Good Fit Can Be Bad, 2002, Trends in Cognitive Sciences.

[11] D. Vickers, et al. Evidence for an Accumulator Model of Psychophysical Discrimination, 1970, Ergonomics.

[12] P. Perruchet, et al. Implicit Learning and Statistical Learning: One Phenomenon, Two Approaches, 2006, Trends in Cognitive Sciences.

[13] H. Pashler, et al. How Persuasive Is a Good Fit? A Comment on Theory Testing, 2000, Psychological Review.

[14] James L. McClelland, et al. The Time Course of Perceptual Choice: The Leaky, Competing Accumulator Model, 2001, Psychological Review.

[15] J. Wixted, et al. The Diagnosticity of Individual Data for Model Selection: Comparing Signal-Detection Models of Recognition Memory, 2011, Psychonomic Bulletin & Review.

[16] G. Miller, et al. Free Recall of Redundant Strings of Letters, 1958, Journal of Experimental Psychology.

[17] H. B. Mann, et al. On a Test of Whether One of Two Random Variables Is Stochastically Larger than the Other, 1947.

[18] Holger Schultheis, et al. Comparing Model Comparison Methods, 2013, CogSci.

[19] Adam N. Sanborn, et al. Model Evaluation Using Grouped or Individual Data, 2008, Psychonomic Bulletin & Review.