Classification with Positive and Negative Equivalence Constraints: Theory, Computation and Human Experiments

We tested the efficiency of category learning when participants are provided only with pairs of objects known to belong either to the same class (Positive Equivalence Constraints, or PECs) or to different classes (Negative Equivalence Constraints, or NECs). Our results from a series of cognitive experiments show dramatic differences in the usability of these two information building blocks, even when they are chosen to contain the same amount of information. Specifically, PECs seem to be used intuitively and quite efficiently, while people are rarely able to gain much information from NECs (unless they are specifically directed toward the best way of using them). Tests with a constrained EM clustering algorithm under similar conditions also show superior performance with PECs. We conclude with a theoretical analysis, showing (by analogy to graph cut problems) that the satisfaction of NECs is computationally intractable, whereas the satisfaction of PECs is straightforward. Furthermore, we show that PECs convey more information than NECs by relating their information content to the number of different graph colorings. These inherent differences between PECs and NECs may explain why people readily use PECs, while many need specific directions to be able to use NECs effectively.
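The computational asymmetry described above can be illustrated with a minimal sketch (not the paper's implementation): satisfying a set of PECs amounts to taking the transitive closure of "same class" links, which union-find computes in near-linear time, whereas satisfying NECs corresponds to graph coloring, which is NP-hard in general. The function names and data layout below are illustrative assumptions.

```python
# Sketch of the PEC/NEC asymmetry. PECs: merge constrained pairs via
# union-find (transitive closure). NECs: the best one can do cheaply is
# verify a proposed labeling; finding one is graph coloring (NP-hard).

def satisfy_pecs(n, pecs):
    """Merge the n points linked by positive equivalence constraints.
    Returns a cluster label (root index) for each point."""
    parent = list(range(n))

    def find(x):
        # Path-halving traversal to the root of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pecs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two components
    return [find(i) for i in range(n)]

def violates_necs(labels, necs):
    """Check whether a labeling breaks any negative constraint.
    Verification is easy; *constructing* a satisfying labeling is not."""
    return any(labels[a] == labels[b] for a, b in necs)
```

For example, with five points and PECs {(0,1), (1,2)}, points 0, 1, and 2 collapse into one cluster while 3 and 4 stay apart; an NEC such as (0, 2) is then checkably violated, but no efficient procedure is known that would have found a consistent labeling from NECs alone.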
