Learning from Ambiguity

In many learning problems, the examples given by the teacher are ambiguously labeled. In this thesis, we examine one framework for learning from ambiguous examples, known as multiple-instance learning. Each example is a bag consisting of any number of instances. A bag is labeled negative if all instances in it are negative, and positive if at least one instance in it is positive. Because the instances themselves are not labeled, each positive bag is an ambiguous example. We would like to learn a concept that correctly classifies unseen bags. We have developed a measure called Diverse Density, along with algorithms for learning from multiple-instance examples. We have applied these techniques to problems in drug design, stock prediction, and image database retrieval. These applications show how to translate the ambiguity in an application domain into bags, and they serve as successful examples of applying Diverse Density techniques. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
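
The bag-labeling rule stated above can be illustrated with a minimal Python sketch. The function names (bag_label, classify_bag), the toy interval concept, and the numeric instances are assumptions made for this illustration; they are not taken from the thesis, and the sketch does not implement Diverse Density.

```python
from typing import Callable, Iterable, Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Return the bag label implied by (hidden) instance labels.

    A bag is positive if at least one instance is positive,
    and negative if every instance is negative.
    """
    return any(instance_labels)


def classify_bag(bag: Iterable[float], concept: Callable[[float], bool]) -> bool:
    """Classify an unseen bag by applying a learned instance-level concept."""
    return any(concept(x) for x in bag)


# Hypothetical concept for illustration: an instance is positive
# when it falls inside the interval [2.0, 3.0].
def in_interval(x: float) -> bool:
    return 2.0 <= x <= 3.0


positive_bag = [0.5, 2.4, 7.1]   # one instance lies inside the interval
negative_bag = [0.5, 5.0, 7.1]   # no instance lies inside the interval

print(classify_bag(positive_bag, in_interval))  # True
print(classify_bag(negative_bag, in_interval))  # False
```

The learner sees only the bag labels, so for a positive bag such as positive_bag it cannot tell which of the three instances is responsible for the label. This is the ambiguity the abstract refers to.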
