Concept learning in the absence of counterexamples: an autoassociation-based approach to classification

ABSTRACT OF THE DISSERTATION

Concept-Learning in the Absence of Counter-Examples: An Autoassociation-Based Approach to Classification

by Nathalie Japkowicz

Dissertation Directors: Stephen José Hanson and Casimir Kulikowski

The overwhelming majority of research currently pursued within the framework of concept-learning concentrates on discrimination-based learning, an inductive learning paradigm that relies on both examples and counter-examples of the concept. This emphasis, however, can present a practical problem: there are real-world engineering problems for which counter-examples are both scarce and difficult to gather. For these problems, recognition-based learning systems are much more appropriate because they do not use counter-examples in the concept-learning phase. The purpose of this dissertation is to analyze a connectionist recognition-based learning system, autoassociation-based classification, and answer the following questions: What features of the autoassociator make it capable of performing classification in the absence of counter-examples? What causes the autoassociator to be significantly more efficient than MLP in certain domains? What domain characteristics cause the autoassociator to be more accurate than MLP, and MLP to be more accurate than the autoassociator?

The dissertation concludes that 1) autoassociation-based classification is possible in a particular class of practical domains, called non-linear and multi-modal, because the autoassociator uses a multi-modal specialization bias to compensate for the absence of counter-examples. This bias can be controlled by varying the capacity of the autoassociator. 2) The difference in efficiency between the autoassociator and MLP observed on this class of domains is caused by the fact that the autoassociator uses a (fast) data-driven generalization strategy whereas MLP has recourse to a (slow) hypothesis-driven one, even though both systems are trained by the backpropagation procedure. 3) The autoassociator classifies more accurately than MLP on domains requiring particularly strong specialization biases caused by the counter-conceptual class, or particularly weak specialization biases caused by the conceptual class; however, MLP is more accurate than the autoassociator on domains requiring particularly strong specialization biases caused by the conceptual class. The results of this study thus suggest that recognition-based systems, which are often dismissed in favor of discrimination-based ones in the context of concept-learning, may present an interesting array of classification strengths.
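To make the recognition-based setting summarized above more concrete, the following is a minimal sketch, not taken from the dissertation, of the general idea behind autoassociation-based classification: a small autoassociator (an autoencoder trained by backpropagation) is fitted to positive examples only, and a query is accepted as an instance of the concept when its reconstruction error falls below a threshold estimated from the training data. The network sizes, learning rate, synthetic data, and the 95th-percentile thresholding rule are illustrative assumptions, not values or procedures from the dissertation.

```python
# A minimal sketch (not from the dissertation) of autoassociation-based
# classification: train a single-hidden-layer autoassociator on positive
# examples only, then accept a query as a member of the concept when its
# reconstruction error stays below a threshold chosen from the training data.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoassociator(X, n_hidden=4, lr=0.5, epochs=3000):
    """Backpropagation on the reconstruction task X -> X.
    n_hidden sets the autoassociator's capacity, which (per the abstract)
    controls how tightly it specializes to the conceptual class."""
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
    b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # hidden (compressed) representation
        Y = sigmoid(H @ W2 + b2)          # reconstruction of the input
        err = Y - X                       # reconstruction error
        dY = err * Y * (1 - Y)            # backprop through output sigmoid
        dH = (dY @ W2.T) * H * (1 - H)    # backprop through hidden sigmoid
        W2 -= lr * H.T @ dY / len(X)
        b2 -= lr * dY.mean(axis=0)
        W1 -= lr * X.T @ dH / len(X)
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def reconstruction_error(X, params):
    W1, b1, W2, b2 = params
    Y = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return ((Y - X) ** 2).sum(axis=1)

# Positive (conceptual-class) examples only -- no counter-examples are used.
X_pos = rng.uniform(0.0, 0.3, size=(200, 8))
params = train_autoassociator(X_pos)

# Illustrative assumption: threshold at the 95th percentile of the
# training-set reconstruction errors.
threshold = np.percentile(reconstruction_error(X_pos, params), 95)

def classify(x):
    """True = recognized as the concept, False = rejected."""
    return reconstruction_error(x[None, :], params)[0] <= threshold

print(classify(rng.uniform(0.0, 0.3, size=8)))   # likely True (in-concept)
print(classify(rng.uniform(0.7, 1.0, size=8)))   # likely False (outlier)
```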
Acknowledgements

I would like to thank all the people who, in one way or another, helped me complete my Ph.D. First, I am grateful to my advisors, Stephen José Hanson and Casimir A. Kulikowski, for their guidance and advice during the development of the research presented in this dissertation. I appreciated Steve's willingness to listen to my ideas and help me transform them into interesting research questions, and I am thankful to Professor Kulikowski for patiently helping me turn them into a coherent whole. I would also like to thank my other committee members. In particular, I am grateful to Mark A. Gluck for offering me the chance to work on an exciting problem, which subsequently turned into a dissertation topic, as well as for his generous financial support and his friendly encouragement. I would also like to thank Robert Vichnevetsky for his kindness and support during both the early and later years of my Ph.D. studies.

As well, I would like to thank Craig Nevill-Manning for agreeing to join my thesis committee at a later date and for his thoughtful insights about the project. Although Vasek Chvatal was not part of my thesis committee, I am deeply indebted to him for the role he played as Graduate Chair: he took an active part in helping me put my thesis committee together and in facilitating communication between me and the members of my committee. I would also like to thank Andrew Gelsey for acting as an interim advisor from January to August 1995.

I have also benefited from interaction with people outside my department. In particular, I received very useful advice on how to formalize my ideas from Nathan Intrator of Tel-Aviv University. Gary Cottrell of UCSD, together with an anonymous reviewer, provided very insightful comments on the study reported in Chapter 5. Catherine Myers of Rutgers-Newark gave me a great deal of help and advice during my first few years in Mark Gluck's lab, and Evangelia Micheli-Tzanakou of Rutgers-New Brunswick provided a warm learning environment in her department.

I would also like to thank the various people who provided me with exciting opportunities during my years as a graduate student. In particular, I would like to thank Stu Zweben for hiring me as a lecturer at Ohio State University during the 1999 Winter and Spring quarters; Nathan Intrator for offering me a research fellowship at Tel-Aviv University during the 1997-1998 school year; Mark Gluck for hiring me as a research assistant in his Rutgers-Newark lab from April 1994 to September 1997; William Cohen for hiring me as a research assistant at Bell Laboratories during the 1992 Spring semester; Jean-Gabriel Ganascia for providing me with space in his research lab at l'Université Pierre et Marie Curie in Paris in the 1995 Spring quarter; John Taylor for doing the same at his King's College lab in London in the 1995 Winter quarter; and Christiane Fellbaum, George Miller, and Randee Tengi for welcoming me into the Cognitive Science Lab of Princeton University, where I conducted the greater part of my research.

My life as a graduate student would not have been the same without the support of many friends, both at Rutgers and elsewhere. I am thankful to Stephen Kwek for his friendship, advice, and support during both my undergraduate and graduate years, despite the many miles often separating us. From the moment I set foot at Rutgers until the last minutes I spent there, Dawn Cohen was a constant source of encouragement and advice, helping me avoid the obstacles of a graduate career and cope with those I didn't avoid. Inna Stainvas of Tel-Aviv University made the year I spent there a very pleasant and instructive one, as we spent many hours discussing various aspects of our projects while working side by side on our dissertations. I also very much enjoyed the terms I spent TA'ing Data Structures with Sesh Venugopal, as well as our many "philosophical" conversations. I am thankful to my friends Frederique Alexandre, Rodrigo Arias, Florence Buzeyne, Tim Cooley, Claudia Goldmann, Peter Mones, Hava Siegelmann, and Zehra Venugopal, as well as office-mates Phil Bohannon, Rick Kaye, and Fan Min, for contributing to making my years as a graduate student most rewarding. Sadly, I can only wish that Paula Bonato, whose friendship and kindness I have much missed since her untimely passing in July 1992, were here today to share in this moment.

Finally, I would like to thank my family for their patience and support throughout my studies.
My parents, Michel and Suzanne Japkowicz, as well as my sister, Florence Japkowicz-Ricouvier, and my brother, Maxime Japkowicz, always showed great interest in my work. Their constant encouragement, their pride in my successes, and their deep concern for my difficulties are some of the major forces that helped me focus on my work and carry it through successfully. Robert and Hedwin Naimark, my cousins, were also very supportive, and I much appreciated their advice on various aspects of the academic world. Toba and Michael Ripsman, my in-laws, were most understanding and supportive of my efforts, and I appreciated them sharing my joys and disappointments as they would have for their own daughter. Last but not least, I would like to thank Norrin Ripsman, my husband, for all the help, support, and love without which I could not have completed my degree. Despite the demands of his own Ph.D. and postdoctoral studies, Norrin invested a great deal of his time and energy in helping me solve the problems I encountered during my graduate career and get through its most trying moments. I am thankful for his patience, his understanding, and his caring.

Dedication

I dedicate this dissertation to my parents, Michel and Suzanne Japkowicz, and my husband, Norrin M. Ripsman.
