Adaptive Multi-class Correlation Filters

Correlation filters have attracted growing attention for their high efficiency and have been well studied for binary classification. However, because the desired output is fixed to a Gaussian function, conventional multi-class classification based on correlation filters suffers from under-fitting in many real-world applications. In this paper, we propose an adaptive multi-class correlation filter (AMCF) method based on the alternating direction method of multipliers (ADMM) framework. Within this framework, we introduce an adaptive output to alleviate the under-fitting problem during the ADMM iterations. By doing so, a closed-form sub-solution is obtained and further used to constrain the optimization objective, simplifying the entire inference mechanism. The proposed approach is successfully combined with Histograms of Oriented Gradients (HOG) features, multi-channel features, and convolutional features, and achieves superior performance over state-of-the-art methods in two multi-class classification tasks: handwritten digit recognition and RGBD-based action recognition.
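For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of a conventional single-channel correlation filter trained against a fixed Gaussian desired output, using the well-known regularized closed form in the Fourier domain. It is not the AMCF method: it has no adaptive output and no ADMM iterations. The 1-D setting, the function names (`gaussian_target`, `train_filter`, `respond`), and the values of `sigma` and `lam` are assumptions made purely for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def gaussian_target(n, sigma=1.0):
    """Fixed desired output: a Gaussian peaked at index 0, wrapped circularly."""
    return [math.exp(-0.5 * (min(t, n - t) / sigma) ** 2) for t in range(n)]


def train_filter(samples, target, lam=1e-3):
    """Closed-form ridge solution per frequency bin:
    H = sum(G * conj(F)) / (sum(|F|^2) + lam)."""
    n = len(target)
    G = dft(target)
    num = [0j] * n
    den = [lam] * n
    for x in samples:
        F = dft(x)
        for k in range(n):
            num[k] += G[k] * F[k].conjugate()
            den[k] += (F[k] * F[k].conjugate()).real
    return [num[k] / den[k] for k in range(n)]


def respond(filt, x):
    """Apply the filter to a signal and return the real-valued response."""
    F = dft(x)
    return [v.real for v in idft([filt[k] * F[k] for k in range(len(x))])]


if __name__ == "__main__":
    x = [4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]  # hypothetical training signal
    h = train_filter([x], gaussian_target(len(x)))
    shifted = x[-3:] + x[:-3]  # same signal, circularly shifted by 3
    response = respond(h, shifted)
    print(max(range(len(x)), key=lambda t: response[t]))
```

In this sketch every training sample is regressed onto the same Gaussian target, which is the fixed desired output that the abstract identifies as a source of under-fitting in the multi-class setting.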
