Near-duplicate detection and alignment for multi-view videos

The growing popularity of video sharing platforms (e.g., YouTube, Vimeo) has led to the widespread diffusion of near-duplicate videos, i.e., sequences obtained by applying different editing operations to the same original clip. However, it is also possible to come across sequences that depict the same event shot from different viewpoints. This situation is very common when analyzing user-generated content acquired with mobile devices. Therefore, for some applications, it can be useful to extend the concept of near-duplicates to include all the videos (and their edited versions) that refer to the same event, even if shot from different viewpoints. In this paper we consider this challenging scenario. More specifically, we focus on the problem of multi-view near-duplicate video detection and temporal alignment. In doing so, we show the limitations of a state-of-the-art algorithm based on robust hashing, and propose a processing pipeline that can also handle sequences taken from significantly different viewpoints.
