Paper information - Robust visual tracking based on convolutional neural network with extreme learning machine

Recently, deep learning has attracted substantial attention as a promising solution to many problems in computer vision. Among the various deep learning architectures, the convolutional neural network (CNN) has demonstrated superior performance as a feature learning method. In this paper, we present a novel hybrid model of a CNN and an extreme learning machine (ELM) for object tracking. Training a conventional CNN requires a substantial amount of computation and a large dataset. An ELM instead generates the parameters of its hidden layers randomly and computes the weights between the hidden and output layers via the regularized least-squares method, thereby dramatically reducing the learning time while producing accurate results with minimal training data. We therefore integrate the ELM auto-encoder architecture into the CNN model. In addition, an effective updating scheme is designed for model training to overcome the tracking drift problem. The joint CNN-ELM tracker is robust to object variations such as illumination changes, occlusion, and rotation in a video sequence. Numerous experiments on various challenging videos demonstrate that the proposed tracker performs favourably compared to several state-of-the-art methods.
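
The ELM training step described in the abstract (random hidden-layer parameters, output weights from regularized least squares) can be illustrated with a minimal NumPy sketch. This is a generic single-hidden-layer ELM and not the paper's CNN-ELM tracker. The function names `train_elm` and `predict_elm`, the `tanh` activation, the hidden size, the regularization constant `C`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def train_elm(X, T, n_hidden=64, C=1e3, seed=0):
    """Fit a single-hidden-layer ELM and return (W, b, beta).

    Illustrative sketch only: n_hidden, C and the tanh activation
    are assumed values, not settings taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden activations, shape (n_samples, n_hidden)
    # Output weights from regularized least squares:
    #   beta = (H^T H + I / C)^{-1} H^T T
    A = H.T @ H + np.eye(n_hidden) / C
    beta = np.linalg.solve(A, H.T @ T)
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 2))
    T = np.sin(3.0 * X[:, :1]) + X[:, 1:] ** 2  # toy regression target
    W, b, beta = train_elm(X, T)
    mse = float(np.mean((predict_elm(X, W, b, beta) - T) ** 2))
    print(f"training MSE: {mse:.4f}")
```

Only `beta` is learned, and it comes from a single linear solve rather than iterative backpropagation, which is where the reduction in training time comes from. In the usual ELM auto-encoder formulation the same solve is used with the input itself as the target (`T = X`). How the paper combines this with the CNN layers is not specified in the abstract.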

Xu Wang | Rui Sun | Xiaoxing Yan
