Text-based image retrieval using progressive multi-instance learning

Relevant and irrelevant images collected from the Web (e.g., Flickr.com) have been employed as loosely labeled training data for image categorization and retrieval. In this work, we propose a new approach to learn a robust classifier for text-based image retrieval (TBIR) using relevant and irrelevant training web images, in which we explicitly handle noise in the loose labels of training images. Specifically, we first partition the relevant and irrelevant training web images into clusters. By treating each cluster as a “bag” and the images in each bag as “instances”, we formulate this task as a multi-instance learning problem with constrained positive bags, in which each positive bag contains at least a portion of positive instances. We present a new algorithm called MIL-CPB to effectively exploit such constraints on positive bags and predict the labels of test instances (images). Observing that the constraints on positive bags may not always be satisfied in our application, we additionally propose a progressive scheme (referred to as Progressive MIL-CPB, or PMIL-CPB) to further improve the retrieval performance, in which we iteratively partition the top-ranked training web images from the current MIL-CPB classifier to construct more confident positive “bags” and then add these new “bags” as training data to learn the subsequent MIL-CPB classifiers. Comprehensive experiments on two challenging real-world web image data sets demonstrate the effectiveness of our approach.
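The bag construction and the positive-bag constraint described above can be illustrated with a minimal sketch. It is not the authors' MIL-CPB implementation: the abstract does not name the clustering method, so plain k-means with a deterministic initialization is assumed here, and the names `kmeans_bags`, `satisfies_positive_bag_constraint`, `top_ranked`, and the parameter `min_positive_ratio` are hypothetical. The classifier itself is omitted; `top_ranked` only shows how the top-scored training images could be selected before being re-partitioned into new bags in the progressive scheme.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_bags(features: List[Vector], num_bags: int,
                iterations: int = 20) -> List[List[int]]:
    """Partition images into clusters and return each cluster as a bag
    (a list of image indices). Assumption: k-means with evenly spaced
    initial centers; the paper may cluster differently."""
    n = len(features)
    centers = [list(features[k * n // num_bags]) for k in range(num_bags)]
    assignment = [0] * n
    for _ in range(iterations):
        # Assign every image to its nearest center.
        assignment = [
            min(range(num_bags), key=lambda k: squared_distance(f, centers[k]))
            for f in features
        ]
        # Move each center to the mean of its members (if it has any).
        for k in range(num_bags):
            members = [features[i] for i in range(n) if assignment[i] == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    bags = [[i for i in range(n) if assignment[i] == k] for k in range(num_bags)]
    return [bag for bag in bags if bag]  # drop empty clusters


def satisfies_positive_bag_constraint(bag: List[int],
                                      instance_labels: List[int],
                                      min_positive_ratio: float) -> bool:
    """True if at least `min_positive_ratio` of the bag's instances are
    labeled positive (+1). The ratio is a hypothetical parameter."""
    positives = sum(1 for i in bag if instance_labels[i] == 1)
    return positives >= min_positive_ratio * len(bag)


def top_ranked(scores: List[float], count: int) -> List[int]:
    """Indices of the `count` highest-scoring training images, which the
    progressive scheme would re-partition into new positive bags."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:count]


if __name__ == "__main__":
    toy_features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    toy_labels = [1, 1, 0, 0, 0, 0]
    for bag in kmeans_bags(toy_features, num_bags=2):
        ok = satisfies_positive_bag_constraint(bag, toy_labels, 0.5)
        print(bag, "meets constraint" if ok else "violates constraint")
```

In this toy run, the bag built from the first three images meets a 0.5 ratio while the second does not, which mirrors the observation that the constraint on positive bags may not always hold for clusters of loosely labeled web images.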
