Key-point detection with multi-layer center-surround inhibition

We present a biologically inspired key-point detection algorithm based on multi-layer, nonlinear center-surround inhibition. We evaluate the detector within a Bag-of-Visual-Words framework on the Oxford-IIIT Pet Dataset for pet recognition. The detector achieves a lower recognition-error rate than the SIFT key-point detector, and training separate codebooks for the ON- and OFF-type key points improves the recognition rate further. These gains hold over a whole range of key-point densities, and the detector also outperforms randomly selected key points.
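To make the idea of layered center-surround inhibition with separate ON and OFF channels concrete, the following is a minimal sketch and not the authors' implementation. It assumes a difference-of-Gaussians center-surround operator, half-wave rectification as the nonlinearity, and selection of the strongest responses as key points; the function names, the sigma values, and the number of layers are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D normalized Gaussian, truncated at about 3 sigma (assumed choice).
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    # Separable Gaussian blur with reflect padding; output has the input's shape.
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(img, pad, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def center_surround(img, sigma_c=1.0, sigma_s=2.0):
    # Narrow "center" minus wide "surround" (difference of Gaussians).
    return blur(img, sigma_c) - blur(img, sigma_s)

def on_off_maps(img, layers=2):
    # Layer 1 splits the signed response into ON (bright on dark) and
    # OFF (dark on bright) channels by half-wave rectification.
    d = center_surround(np.asarray(img, dtype=float))
    on, off = np.maximum(d, 0), np.maximum(-d, 0)
    # Further layers repeat inhibition plus rectification on each channel
    # (an assumed form of "multi-layer" processing).
    for _ in range(layers - 1):
        on = np.maximum(center_surround(on), 0)
        off = np.maximum(center_surround(off), 0)
    return on, off

def top_keypoints(response, n):
    # Return the n strongest locations as (row, col); n sets the key-point density.
    idx = np.argsort(response, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(idx, response.shape)
    return list(zip(rows.tolist(), cols.tolist()))

# Usage: a single bright dot should be the strongest ON key point.
image = np.zeros((32, 32))
image[16, 16] = 1.0
on_map, off_map = on_off_maps(image)
on_points = top_keypoints(on_map, 5)
```

Keeping the ON and OFF maps separate is what would allow a separate codebook to be trained for each key-point type in a Bag-of-Visual-Words pipeline; descriptor extraction and codebook training are outside the scope of this sketch.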
