An Improvement of Kernel-Based Object Tracking Based on Human Perception

The objective of the paper is to embed perception rules into the kernel-based target tracking algorithm and to evaluate to what extent these rules improve the original tracking algorithm without any additional computational cost. To this aim, the target is represented through features related to its visual appearance; it is then tracked in subsequent frames using a metric that also correlates well with human visual perception (HVP). The use of HVP rules is doubly advantageous: it both increases tracking efficacy and considerably reduces the computational cost of the tracking process, thanks to the reduced size of the perceptual feature space. Various tests on video sequences have shown the stability and robustness of the proposed framework, even in the presence of other moving objects and of partial or complete target occlusion lasting a limited number of consecutive frames.
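
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard kernel-based representation (an Epanechnikov-weighted histogram compared through the Bhattacharyya coefficient), not the perceptual features or the HVP-correlated metric proposed in the paper, which the abstract does not specify. The function names, the use of grayscale intensities instead of color, and the choice of 8 bins are all assumptions made for the example.

```python
import math

def kernel_histogram(patch, bins=8, levels=256):
    """Epanechnikov-weighted intensity histogram of a 2D patch.

    `patch` is a list of rows of integer intensities in [0, levels).
    Hypothetical helper: grayscale and 8 bins are illustrative choices.
    """
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Half-sizes used to normalise pixel offsets from the patch centre.
    ry, rx = max(h / 2.0, 1.0), max(w / 2.0, 1.0)
    hist = [0.0] * bins
    for y in range(h):
        for x in range(w):
            r2 = ((y - cy) / ry) ** 2 + ((x - cx) / rx) ** 2
            weight = max(0.0, 1.0 - r2)  # Epanechnikov profile
            hist[patch[y][x] * bins // levels] += weight
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalised histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

# Target model and two candidate regions (synthetic 5x5 patches).
target = kernel_histogram([[10] * 5 for _ in range(5)])
same = kernel_histogram([[12] * 5 for _ in range(5)])
other = kernel_histogram([[240] * 5 for _ in range(5)])
print(bhattacharyya(target, same), bhattacharyya(target, other))
```

In a tracker of this kind, the candidate region whose histogram is most similar to the target model is selected in each frame; a perceptual variant would replace both the feature space and the similarity measure while keeping this overall structure.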
