Compressed spatio-temporal descriptors for video matching and retrieval

The content of a video can be described in terms of the appearance and motion of its scenes. In this paper, we propose a compressed spatio-temporal descriptor suited to video matching and retrieval tasks. We use a modified wavelet-based compression technique that exploits the temporal redundancy of the data using optical flow. To obtain a compact flow representation, a spline-based technique is used. The optical flow field gives the directions along which the gray levels vary regularly in time. Wavelet decomposition along these directions yields fewer coefficients and thus higher compression. We demonstrate that the wavelet coefficients and flow parameters can be used efficiently for (1) video retrieval and matching, and (2) calculating spatio-temporal similarity between articulated objects. Results are demonstrated on several sequences.
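
The claim that decomposing along the flow directions yields fewer coefficients can be illustrated with a toy example. The sketch below is not the method of the paper: it assumes a plain one-level Haar transform across two frames, a synthetic random frame, and a known integer horizontal shift applied with a circular wrap (`np.roll`) in place of a spline-based flow estimate. All names (`temporal_haar`, `count_significant`, `shift`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def temporal_haar(frame_a, frame_b):
    """One level of the orthonormal Haar transform across two frames."""
    low = (frame_a + frame_b) / np.sqrt(2.0)   # temporal average band
    high = (frame_a - frame_b) / np.sqrt(2.0)  # temporal detail band
    return low, high


def count_significant(coeffs, threshold=1e-6):
    """Number of coefficients whose magnitude exceeds the threshold."""
    return int(np.sum(np.abs(coeffs) > threshold))


# Synthetic pair: the second frame is the first one translated horizontally.
rng = np.random.default_rng(0)
frame0 = rng.random((32, 32))
shift = 3  # assumed known flow, in pixels per frame (illustrative value)
frame1 = np.roll(frame0, shift, axis=1)

# Temporal transform along the time axis only, ignoring motion.
_, high_plain = temporal_haar(frame0, frame1)

# Temporal transform along the motion direction: undo the shift first,
# so that each coefficient pairs pixels lying on the same trajectory.
frame1_aligned = np.roll(frame1, -shift, axis=1)
_, high_flow = temporal_haar(frame0, frame1_aligned)

n_plain = count_significant(high_plain)
n_flow = count_significant(high_flow)
print("significant detail coefficients without flow:", n_plain)
print("significant detail coefficients along flow:  ", n_flow)
```

In this idealized setting the motion is a pure translation and the flow is known exactly, so the detail band along the flow vanishes. With real sequences and estimated flow the detail band would only be sparser, not empty, and the flow parameters themselves would have to be stored alongside the coefficients.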
