Tagging over time: real-world image annotation by lightweight meta-learning

Automatic image annotation has lately attracted intense interest among multimedia researchers. However, modest performance guarantees and limited adaptability often restrict its applicability to real-world settings. We propose tagging over time (T/T) to push the technology toward real-world applicability. Of particular interest are online systems that receive user-provided images and feedback over time, where user focus may change and evolve. The T/T framework consists of a principled probabilistic approach to meta-learning, which acts as a go-between for a 'black-box' annotation system and its users. Inspired by inductive transfer, the approach attempts to harness available information, including the black-box model's performance, the image representations, and the WordNet ontology. Being computationally 'lightweight', the meta-learner can be re-trained efficiently over time, to improve and/or adapt to changes. The black-box annotation model does not need to be re-trained, which allows computationally intensive algorithms to be used for it. We experiment with standard image datasets and real-world data streams, using two existing annotation systems as black-boxes, in both batch and online annotation settings. We observe that adding this meta-learning layer produces much improved results that outperform the best-known results. In the online setting, the T/T approach produces progressively better annotation over time, significantly outperforming both the black-box and the static form of the meta-learner on real-world data.
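The abstract gives no formulas, so the following is only a minimal sketch of the general idea of a lightweight meta-layer wrapped around a frozen black-box annotator, not the T/T model itself. It assumes the black box is a callable returning a list of tags, and it uses only one of the information sources mentioned above (the black box's observed performance); image representations and WordNet are left out. The class name `OnlineTagMetaLearner`, the `decay` forgetting factor, and the smoothed reliability estimate are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Set


class OnlineTagMetaLearner:
    """Illustrative meta-layer over a frozen black-box annotator (not the paper's model)."""

    def __init__(self, black_box: Callable[[Hashable], List[str]],
                 decay: float = 1.0, prior: float = 1.0) -> None:
        self.black_box = black_box          # never re-trained here
        self.decay = decay                  # < 1.0 forgets old feedback (assumed mechanism)
        self.prior = prior                  # smoothing pseudo-count
        self.guesses = defaultdict(float)   # times the black box guessed each tag
        self.hits = defaultdict(float)      # times that guess was confirmed by a user

    def reliability(self, tag: str) -> float:
        # Smoothed estimate of P(tag is correct | black box guessed tag).
        return (self.hits[tag] + self.prior) / (self.guesses[tag] + 2 * self.prior)

    def annotate(self, image: Hashable, threshold: float = 0.5) -> List[str]:
        # Keep only guesses judged reliable so far, most reliable first.
        kept = [t for t in self.black_box(image) if self.reliability(t) >= threshold]
        return sorted(kept, key=self.reliability, reverse=True)

    def feedback(self, image: Hashable, true_tags: Set[str]) -> None:
        # Cheap online update: discount old counts, then add the new evidence.
        for tag in list(self.guesses):
            self.guesses[tag] *= self.decay
            self.hits[tag] *= self.decay
        for tag in self.black_box(image):
            self.guesses[tag] += 1.0
            if tag in true_tags:
                self.hits[tag] += 1.0
```

In use, `annotate` would be called on each incoming image and `feedback` whenever a user supplies corrected tags. Each update touches only a handful of counters, which is the sense in which such a layer could be called lightweight, and setting `decay` below 1.0 is one simple way to let the estimates follow a shifting user focus.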
