Learning Object Representations for Visual Object Class Recognition

This talk discussed our object-class recognition method, which won the classification contest of the Pascal VOC Challenge 2007. We submitted two recognition methods that share the same underlying image representations, each defined by a choice of image sampler, local descriptor, and global spatial grid. The two methods also share the classifier, a one-against-rest non-linear Support Vector Machine with a chi-square kernel. They differ in how they combine multiple representations (channels). The first method follows the approach of Zhang et al., where the final similarity measure is the sum of per-channel similarities. The second method employs a genetic algorithm to determine, on a per-class basis, the parameters of a generalized RBF kernel incorporating all the channels, i.e., to estimate the importance of each sampling/description/spatial method for recognition and to optimize the required level of generalization. Both methods outperformed the other state-of-the-art submissions.
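
To make the channel-combination idea concrete, the following is a minimal sketch and not the submitted system. It assumes each channel is a non-negative histogram per image, uses the common chi-square distance, and combines channels in a generalized RBF kernel of the form exp(-sum_c w_c * D_c). The function names (`chi2_distance`, `multichannel_kernel`), the `eps` guard, and the per-channel weights `w_c` are illustrative assumptions. In the second method described above, such parameters would be chosen per class by a genetic algorithm, which is not shown here.

```python
import numpy as np

def chi2_distance(X, Y, eps=1e-10):
    """Pairwise chi-square distances between rows of X (n, d) and Y (m, d).

    Assumes non-negative histogram features; eps (an assumption of this
    sketch) guards against division by zero when both bins are empty.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    diff = X[:, None, :] - Y[None, :, :]   # shape (n, m, d)
    total = X[:, None, :] + Y[None, :, :]  # shape (n, m, d)
    return 0.5 * np.sum(diff ** 2 / (total + eps), axis=-1)

def multichannel_kernel(channels_a, channels_b, weights):
    """Generalized RBF kernel over several channels.

    channels_a, channels_b: lists of arrays, one (n, d_c) / (m, d_c) pair
    per channel. weights: one non-negative scalar per channel (hypothetical
    placeholders for the parameters a per-class search would tune).
    """
    combined = sum(
        w * chi2_distance(A, B)
        for w, A, B in zip(weights, channels_a, channels_b)
    )
    return np.exp(-combined)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy channels for five images, with different dimensionality.
    channel_1 = rng.random((5, 8))
    channel_2 = rng.random((5, 16))
    K = multichannel_kernel(
        [channel_1, channel_2], [channel_1, channel_2], weights=[1.0, 0.5]
    )
    print(K.shape)
```

The resulting matrix could be passed to an SVM implementation that accepts precomputed kernels, with one binary classifier trained per class in a one-against-rest setup.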