Visual Object Recognition in Diverse Scenes with Multiple Instance Learning

Visual object recognition is important to the robotics industry and is a prerequisite for other robot functions, such as grasping and manipulation. Object representation and a learning technique are the two indispensable components of this demanding task, while arbitrary object appearance and diverse scenes with cluttered backgrounds are its two great challenges. Compared with object representation, however, learning techniques are less developed for dealing with these challenges. This paper extends the multiple instance learning (MIL) technique to the multi-class classification scenario and introduces this multi-class MIL framework to the object recognition domain for the first time. The framework is independent of the object representation and is useful for object/background discrimination in unseen scenes. Preliminary experiments show that it compares favorably with a supervised learning approach that takes whole images as the classifier training input.
