Conceptual complexity and the bias/variance tradeoff

In this paper we propose that the conventional dichotomy between exemplar-based and prototype-based models of concept learning is helpfully viewed as an instance of what is known in the statistical learning literature as the bias/variance tradeoff. The bias/variance tradeoff can be thought of as a sliding scale that modulates how closely a learning procedure adheres to its training data. At one end of the scale (high variance), models can entertain very complex hypotheses, allowing them to fit a wide variety of data very closely, but as a result they can generalize poorly, a phenomenon called overfitting. At the other end of the scale (high bias), models make relatively simple and inflexible assumptions, and as a result may fit the data poorly, a phenomenon called underfitting. Exemplar and prototype models of category formation lie at opposite ends of this scale: prototype models are highly biased, in that they assume a simple, standard conceptual form (the prototype), while exemplar models have very little bias but high variance, allowing them to fit virtually any combination of training data. We investigated human learners' position on this spectrum by confronting them with category structures at varying levels of intrinsic complexity, ranging from simple prototype-like categories to much more complex multimodal ones. The results show that human learners adopt an intermediate point on the bias/variance continuum, inconsistent with either of the poles occupied by most conventional approaches. We present a simple model that adjusts (regularizes) the complexity of its hypotheses to suit the training data, and that fits the experimental data better than representative exemplar and prototype models.
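
The contrast between the two poles can be made concrete with a small sketch. It is not the model presented in the paper. It is a toy illustration under stated assumptions: stimuli vary along a single dimension, the prototype is the category mean, and exemplar similarity decays exponentially with distance. The stimulus values, the function names, and the sensitivity parameter `c` are all hypothetical choices made for this example.

```python
import math


def prototype_predict(x, categories):
    """High-bias pole: summarize each category by its mean (the prototype)
    and assign x to the category with the nearest prototype."""
    def distance_to_prototype(label):
        items = categories[label]
        prototype = sum(items) / len(items)
        return abs(x - prototype)
    return min(sorted(categories), key=distance_to_prototype)


def exemplar_predict(x, categories, c=2.0):
    """Low-bias pole: sum the similarity of x to every stored exemplar of
    each category. The sensitivity c is an assumed value."""
    def summed_similarity(label):
        return sum(math.exp(-c * abs(x - e)) for e in categories[label])
    return max(sorted(categories), key=summed_similarity)


def training_accuracy(predict, categories):
    """Proportion of training items that a model assigns to their own category."""
    pairs = [(x, label) for label, items in categories.items() for x in items]
    correct = sum(predict(x, categories) == label for x, label in pairs)
    return correct / len(pairs)


# Hypothetical stimuli: category "A" is bimodal, category "B" is unimodal
# and sits between the two modes of "A".
categories = {
    "A": [-2.2, -2.0, -1.8, 1.8, 2.0, 2.2],
    "B": [-0.3, -0.1, 0.1, 0.3, 0.5],
}

print("prototype accuracy:", training_accuracy(prototype_predict, categories))
print("exemplar accuracy:", training_accuracy(exemplar_predict, categories))
```

In this toy setting the mean of the bimodal category falls between its two modes, almost on top of the mean of the other category, so the prototype rule misclassifies several training items (underfitting). The exemplar rule reproduces every training label. The same flexibility would also let it reproduce any mislabeled or atypical training items, which is the risk the abstract describes as overfitting.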
