Paper information - Preserving feature layout information for object recognition

Preserving feature layout information for object recognition

In this paper, we propose a method for object classification tasks that preserves the layout information of feature maps, which is lost in the fully connected layer. In the bag-of-features framework, a codebook encodes an image by convolution to produce a group of response maps, and these maps hold the location information of features in regular grids. To obtain a mid-level representation, it is common to concatenate all the features into one long feature vector, on which a linear classifier or fully connected layer is applied to predict labels. However, because the regular grids disappear, the spatial information of the features vanishes before it is fed to the higher layer, even though it is preserved during feature extraction. We address this problem by applying a spatial description feature (SDF) that preserves more useful information using modified spatial pyramids. In addition, to enhance the performance of SDFs, we design a forward–backward strategy to select receptive fields. Experiments show that knowledge of the spatial layout of features improves classification and that the forward–backward learning scheme yields a compact, high-performance pipeline. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

Shuzhi Sam Ge | Li Ma | Qian Zhao | Sibang Liu
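
The abstract's point that pooling over a spatial pyramid keeps layout information that a single global summary discards can be illustrated with a small sketch. This is generic spatial pyramid max pooling over one response map, not the authors' SDF or their modified pyramid; the function name `spatial_pyramid_pool`, the `levels` parameter, and the toy maps are assumptions made for illustration only.

```python
def spatial_pyramid_pool(response_map, levels=(1, 2)):
    """Max-pool one 2-D response map over a pyramid of grids.

    Level n splits the map into n x n cells and keeps one maximum per cell,
    so the position of a value in the output encodes coarse layout.
    Assumes every n in `levels` is no larger than the map's height and width.
    """
    height = len(response_map)
    width = len(response_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0, r1 = i * height // n, (i + 1) * height // n
            for j in range(n):
                c0, c1 = j * width // n, (j + 1) * width // n
                pooled.append(max(response_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two toy response maps (hypothetical data): same peak value, opposite corners.
top_left = [[9, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
bottom_right = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 9]]

# Level 1 alone (a global max) cannot tell the maps apart.
print(spatial_pyramid_pool(top_left, levels=(1,)))      # [9]
print(spatial_pyramid_pool(bottom_right, levels=(1,)))  # [9]

# Adding the 2 x 2 level records which quadrant the peak fell in.
print(spatial_pyramid_pool(top_left))      # [9, 9, 0, 0, 0]
print(spatial_pyramid_pool(bottom_right))  # [9, 0, 0, 0, 9]
```

In this sketch, the two maps share the same global maximum and differ only in the 2 x 2 level, which is the kind of layout cue the abstract argues is worth keeping for the classifier.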
