The Joint Submission of the TU Berlin and Fraunhofer FIRST (TUBFI) to the ImageCLEF2011 Photo Annotation Task

In this paper we present details on the joint submission of TU Berlin and Fraunhofer FIRST to the ImageCLEF 2011 Photo Annotation Task. We sought to experiment with extensions of Bag-of-Words (BoW) models at several levels and to apply several kernel-based learning methods recently developed in our group. For classifier training we used non-sparse multiple kernel learning (MKL) and an efficient multi-task learning (MTL) heuristic based on MKL over kernels from classifier outputs. For the multi-modal fusion we used a smoothing method on tag-based features inspired by Bag-of-Words soft mappings and Markov ran- dom walks. We submitted one multi-modal run extended by the user tags and four purely visual runs based on Bag-of-Words models. Our best visual result which used the MTL method was ranked first according to mean average preci- sion (MAP) within the purely visual submissions. Our multi-modal submission achieved the first rank by MAP among the multi-modal submissions and the best MAP among all submissions. Submissions by other groups such as BPACAD, CAEN, UvA-ISIS, LIRIS were ranked closely.

[1]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[2]  Cordelia Schmid,et al.  Multimodal semi-supervised learning for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[4]  Motoaki Kawanabe,et al.  Multi-task Learning via Non-sparse Multiple Kernel Learning , 2011, CAIP.

[5]  Daniel Sheldon,et al.  Graphical Multi-Task Learning , 2008 .

[6]  Gabriela Csurka,et al.  LEAR and XRCE's Participation to Visual Concept Detection Task - ImageCLEF 2010 , 2010, CLEF.

[7]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[8]  Nanning Zheng,et al.  A biased sampling strategy for object categorization , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[9]  Koen E. A. van de Sande,et al.  The University of Amsterdam's Concept Detection System at ImageCLEF 2009 , 2009, CLEF.

[10]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[11]  Alexander Zien,et al.  lp-Norm Multiple Kernel Learning , 2011, J. Mach. Learn. Res..

[12]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Martin Braschler,et al.  CLEF 2010 LABs and Workshops, Notebook Papers, 22-23 September 2010, Padua, Italy , 2010, CLEF.

[14]  Motoaki Kawanabe,et al.  Multi-modal visual concept classification of images via Markov random walk over tags , 2011, 2011 IEEE Workshop on Applications of Computer Vision (WACV).

[15]  Frédéric Jurie,et al.  Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.

[16]  M. Kloft,et al.  l p -Norm Multiple Kernel Learning , 2011 .

[17]  Charles A. Micchelli,et al.  Learning Multiple Tasks with Kernel Methods , 2005, J. Mach. Learn. Res..

[18]  Cor J. Veenman,et al.  Kernel Codebooks for Scene Categorization , 2008, ECCV.

[19]  Motoaki Kawanabe,et al.  On Taxonomies for Multi-class Image Categorization , 2012, International Journal of Computer Vision.

[20]  Cordelia Schmid,et al.  Learning Object Representations for Visual Object Class Recognition , 2007, ICCV 2007.

[21]  Stefanie Nowak,et al.  The CLEF 2011 Photo Annotation and Concept-based Retrieval Tasks , 2011, CLEF.