Report Fragment-Based Learning of Visual Object Categories

Department of Electrical Engineering, California Institute of Technology, Pasadena, California 91125

Summary

When we perceive a visual object, we implicitly or explicitly associate it with a category we know [1–3]. It is known that the visual system can use local, informative image fragments of a given object, rather than the whole object, to classify it into a familiar category [4–8]. How we acquire informative fragments has remained unclear. Here, we show that human observers acquire informative fragments during the initial learning of categories. We created new, but naturalistic, classes of visual objects by using a novel "virtual phylogenesis" (VP) algorithm that simulates key aspects of how biological categories evolve. Subjects were trained to distinguish two of these classes by using whole exemplar objects, not fragments. We hypothesized that if the visual system learns informative object fragments during category learning, then subjects must be able to perform the newly learned categorization by using only the fragments as opposed to whole objects. We found that subjects were able to successfully perform the classification task by using each of the informative fragments by itself, but not by using any of the comparable, but uninformative, fragments. Our results not only reveal that novel categories can be learned by discovering informative fragments but also introduce and illustrate the use of VP as a versatile tool for category-learning research.

Results

Using VP to Create Shape Classes

The VP algorithm generates naturalistic object categories by emulating biological phylogenesis (see Supplemental Data available online). With VP, we created three classes of novel objects, classes A, B, and C, and used 200 exemplars from each (Figure 1).
Note that the three classes are very similar to each other, so that distinguishing among them is nontrivial (see below and Figure S1). Moreover, no two objects, including objects within a given category, were exactly alike, so that distinguishing among them required learning the relevant statistical properties of the objects and ignoring the irrelevant variations. Finally, note that the differences between categories arose spontaneously and randomly during VP, rather than as a result of externally imposed rules.

Extracting Informative Fragments

We isolated ten fragments ("Main" fragments, Figures 2A and 2B) that were highly informative for distinguishing class A from class B (the main task in experiment 1; see Supplemental Data for details). We also isolated ten "Control" fragments (Figures 2C and 2D) and ten "IPControl" fragments (Figure S2) that were uninformative for the main task but visually comparable to the main fragments. The mutual information (MI) value of a given fragment quantifies the information it conveys about a given category. The higher the fragment's MI, the more useful the fragment is for categorization. The MI values of all fragments used in this study are listed in Supplemental Data.

Testing the Informativeness of Individual Fragments

The experiments consisted of training the subjects on whole objects and then testing them on fragments. Because only whole objects, not fragments, were used during training, subjects were not aware of the fragments or required to learn them. After the subjects were trained in the task, we tested the extent to which subjects were able to perform the classification task by using the fragments, each presented individually (see Figure 3 and Supplemental Data). We hypothesized that if the subjects learned informative object fragments during the training, then the subjects must be able to perform the categorization task by using the individual main fragments, but not the control fragments.

The observed performance closely matched these predictions.
Figure 4A shows the average performance of six subjects using the main fragments. Subjects performed significantly above chance with each of the fragments (binomial tests, p < 0.05, data not shown). The only exception to this was the performance of one subject with main fragment #9, for which she classified the object containing the fragment as A in only 1/16 (6.25%) of the trials (also see below). Altogether, these results indicate that the subjects were able to categorize the objects on the basis of each of the fragments alone and that the performance with the fragments was generally indistinguishable from the performance of the subjects with the whole object.

By contrast, subjects were unable to perform the task above chance levels by using any of the control or IPControl fragments (Figures 4B and 4C; binomial tests, p > 0.05). That is, subjects were about equally likely to classify an object as belonging to class A or class B on the basis of a given control or IPControl fragment. Thus, although all three types of fragments belonged to class A, only the main fragments were likely to be assigned to class A.

To ensure that the above results were not a function of a fortuitous designation of object classes, we performed experiment 2, in which we repeated the design of experiment 1 but with a different set of class designations, whereby the main task was to distinguish class C from class B (see Figure S4). A different set of four subjects participated in this experiment. The results of this experiment were similar to those in experiment 1 (Figure S5).

Additional analyses indicated the performance showed no
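
The binomial tests reported above compare the number of "class A" responses to a fragment against the 50% rate expected from guessing between two classes. The following is a minimal sketch of such a test, not the authors' analysis code: the function name `p_at_least`, the one-sided form of the test, and the 14-of-16 response count are assumptions made for illustration. Only the 16-trial count and the 1/16 result come from the text.

```python
from math import comb


def p_at_least(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact one-sided binomial tail: P(X >= successes) at the chance rate."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    # Sum the binomial probability mass from the observed count upward.
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical subject: 14 "class A" responses in 16 trials with one fragment.
print(f"14/16: p = {p_at_least(14, 16):.4f}")

# The exception reported in the text: 1 "class A" response in 16 trials.
print(f" 1/16: p = {p_at_least(1, 16):.4f}")
```

Under these assumptions, 14 of 16 responses gives a p value of roughly 0.002, well below 0.05, whereas 1 of 16 gives a p value close to 1, so that fragment cannot be counted as classified above chance by that subject.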

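The mutual information (MI) measure described under "Extracting Informative Fragments" can be illustrated with a small calculation. This is a sketch under stated assumptions, not the procedure from the Supplemental Data: it treats each fragment as simply present or absent in an object (how presence is detected is not specified here), and the function name `fragment_mutual_information` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def fragment_mutual_information(present, labels):
    """MI in bits between fragment presence (0/1) and class label."""
    if len(present) != len(labels) or not present:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(present)
    joint = Counter(zip(present, labels))  # counts of (presence, label) pairs
    n_f = Counter(present)                 # marginal counts of presence
    n_c = Counter(labels)                  # marginal counts of class labels
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        mi += p_fc * log2(p_fc / ((n_f[f] / n) * (n_c[c] / n)))
    return mi


# Toy data: four objects, labels 1 = class A, 0 = class B.
labels = [1, 1, 0, 0]
informative = [1, 1, 0, 0]    # fragment appears only in class A objects
uninformative = [1, 0, 1, 0]  # fragment appears equally often in both classes

print(fragment_mutual_information(informative, labels))    # 1.0
print(fragment_mutual_information(uninformative, labels))  # 0.0
```

In this toy case the fragment that occurs only in class A carries a full bit of information about the class, while the fragment that occurs equally often in both classes carries none, which mirrors the distinction between the main fragments and the control fragments.
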
[1] David G. Stork, et al. Pattern Classification, 1973.

[2] Shimon Ullman, et al. View-Invariant Recognition Using Corresponding Object Fragments, 2004, ECCV.

[3] I. Gauthier, et al. Visual object understanding, 2004, Nature Reviews Neuroscience.

[4] T. Palmeri, et al. Learning categories at different hierarchical levels: A comparison of category learning models, 1999, Psychonomic Bulletin & Review.

[5] Shimon Ullman, et al. Class Information Predicts Activation by Object Fragments in Human Object Areas, 2008, Journal of Cognitive Neuroscience.

[6] S. Hochstein, et al. The reverse hierarchy theory of visual perceptual learning, 2004, Trends in Cognitive Sciences.

[7] J. Hegdé, et al. Fragment-Based Learning of Visual Object Categories, 2008, Current Biology.

[8] Richard N. Aslin, et al. Statistical learning of new visual feature combinations by infants, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[9] Eleanor Rosch, et al. Principles of Categorization, 1978.

[10] Daniel Kersten, et al. Bootstrapped learning of novel objects, 2003, Journal of Vision.

[11] A. Markman, et al. Category use and category learning, 2003, Psychological Bulletin.

[12] Shimon Ullman, et al. Mutual information of image fragments predicts categorization in humans: Electrophysiological and behavioral evidence, 2007, Vision Research.

[13] Robert R. Sokal, et al. A Phylogenetic Analysis of the Caminalcules. II. Estimating the True Cladogram, 1983.

[14] Michael J. Tarr, et al. Visual object recognition, 2002.

[15] E. Rosch, et al. Categorization of Natural Objects, 1981.

[16] A. Yuille, et al. Object perception as Bayesian inference, 2004, Annual Review of Psychology.

[17] S. Edelman, et al. A model of visual recognition and categorization, 1997, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences.

[18] Wayne D. Gray, et al. Basic objects in natural categories, 1976, Cognitive Psychology.

[19] I. Biederman, et al. Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance, 1993.

[20] S. Ullman. Object recognition and segmentation by a fragment-based hierarchy, 2007, Trends in Cognitive Sciences.

[21] David L. Faigman, et al. Human category learning, 2005, Annual Review of Psychology.

[22] Edward E. Smith, et al. Categories and concepts, 1984.

[23] Robert R. Sokal, et al. A Phylogenetic Analysis of the Caminalcules. I. The Data Base, 1983.

[24] M. Tarr, et al. Training 'greeble' experts: a framework for studying expert object recognition processes, 1998, Vision Research.

[25] Michel Vidal-Naquet, et al. Visual features of intermediate complexity and their use in classification, 2002, Nature Neuroscience.