XRCE's Participation at Medical Image Modality Classification and Ad-hoc Retrieval Tasks of ImageCLEF 2011

The aim of this document is to describe the methods we used in the Medical Image Modality Classification and Ad-hoc Image Retrieval tasks of ImageCLEF 2011. The main novelty in medical image modality classification this year was that there were more classes (18 modalities), organized in a hierarchy, and that only a few annotated examples were available for some categories. Our strategy for image categorization was therefore to use a semi-supervised approach. In our experiments, we investigated mono-modal (text and image) as well as mixed-modality classification. The image classification was based on Fisher Vectors built on SIFT-like local orientation histograms and local color statistics. For the text representation we used a binarized bag-of-words, where each element indicates whether the corresponding term appears in the image caption. For multi-modal classification, we simply averaged the text and image classification scores.

For the ad-hoc retrieval task, we used the image captions for text retrieval and Fisher Vectors for visual similarity and modality detection. Our text runs were based on a late fusion of different state-of-the-art text experts and the Lexical Entailment model. This Lexical Entailment model, which used last year's articles to compute similarities between terms, ranked first in the previous challenge.

Concerning the submitted runs, we realized that we had inadvertently forgotten to submit our best run from last year [3]. Nor did we submit the improvement over this run that was proposed in [6]. Overall, this explains the mediocre performance of our submitted runs. In this document, we show that last year's system and its improvements would have achieved top performance. We did not tune the parameters of this system for this year's task; we simply evaluated the runs we did not submit.
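To make the image representation concrete, the following is a minimal sketch of Fisher Vector encoding over local descriptors, keeping only the gradients with respect to the GMM means for brevity; the `fisher_vector` function, the GMM size, and the random descriptors are illustrative assumptions, not the exact encoder used in our runs.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Fisher Vector of a set of local descriptors (mean gradients only).
    Assumes a GMM with diagonal covariances, as is standard for FVs."""
    N, D = descriptors.shape
    gamma = gmm.predict_proba(descriptors)       # (N, K) soft assignments
    means = gmm.means_                           # (K, D)
    sigmas = np.sqrt(gmm.covariances_)           # (K, D) diagonal std devs
    weights = gmm.weights_                       # (K,)
    # Gradient of the average log-likelihood w.r.t. each component mean
    diffs = (descriptors[:, None, :] - means) / sigmas   # (N, K, D)
    G = (gamma[:, :, None] * diffs).sum(axis=0)          # (K, D)
    G /= N * np.sqrt(weights)[:, None]
    fv = G.ravel()
    # Power and L2 normalization, the usual post-processing for FVs
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)

# Hypothetical usage: fit a small GMM on pooled local descriptors
# (SIFT-like orientation histograms in the actual system).
rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 64)).astype(np.float32)
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(train)
fv = fisher_vector(rng.normal(size=(200, 64)).astype(np.float32), gmm)
```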
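Likewise, a minimal sketch of the text representation and the multi-modal fusion step described above, assuming per-class scores from the text and image classifiers are already available as NumPy arrays; the function names, the toy vocabulary, and the equal-weight average are assumptions for illustration.

```python
import numpy as np

def binarize_captions(captions, vocabulary):
    """Binarized bag-of-words: entry (i, j) is 1 iff term j of the
    vocabulary occurs in caption i, and 0 otherwise."""
    term_index = {t: j for j, t in enumerate(vocabulary)}
    X = np.zeros((len(captions), len(vocabulary)), dtype=np.float32)
    for i, caption in enumerate(captions):
        for token in caption.lower().split():
            j = term_index.get(token)
            if j is not None:
                X[i, j] = 1.0
    return X

def late_fusion(text_scores, image_scores):
    """Equal-weight average of per-class classification scores,
    the simple fusion rule used for the multi-modal runs."""
    return 0.5 * (text_scores + image_scores)

# Hypothetical usage with a toy vocabulary and 3 modality classes
vocab = ["mri", "ct", "xray"]
X = binarize_captions(["Axial CT of the chest"], vocab)

text_scores = np.array([[0.2, 0.7, 0.1]])   # e.g. from the text expert
image_scores = np.array([[0.4, 0.3, 0.3]])  # e.g. from Fisher Vectors
predicted_class = late_fusion(text_scores, image_scores).argmax(axis=1)
```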