Tracking objects of interest in an image sequence: from salient points to statistical measures

Object tracking in video arises in fields such as computer vision (video surveillance, for example) and television and film post-production (special effects). It comes in two main variants: region-of-interest tracking, which follows the object coarsely, and spatio-temporal segmentation, which follows the contours of the object of interest precisely. In both cases, the region or object of interest must first be outlined on the first image of the video sequence, and possibly on the last. In this thesis we propose one method for each of these two types of tracking, together with a fast implementation, exploiting the Graphics Processing Unit (GPU), of a region-of-interest tracking method developed elsewhere.

The first method performs region-of-interest tracking by analyzing the temporal trajectories of salient points. Salient points (typically locations of high curvature of the isointensity lines) are detected in every image of the sequence. Trajectories are built by linking points in successive images whose neighborhoods are consistent. Our first contribution is to analyze the trajectories over a group of pictures, which improves the quality of the motion estimate. In addition, we assign each trajectory a spatio-temporal weight, which adds a temporal constraint on the motion while accounting for local geometric deformations of the object that a global motion model ignores.

The second method performs spatio-temporal segmentation. It estimates the motion of the object's contour from the information contained in a band extending on either side of that contour. This band captures the contrast between the background and the object in a local context. This is our first contribution for this method. In addition, matching a portion of the band with an area of the next image in the sequence by means of a statistical similarity measure, namely the entropy of the residual, improves tracking while making it easier to choose the optimal size of the band.

Finally, we propose a fast implementation of an existing region-of-interest tracking method. This method relies on a statistical similarity measure, the Kullback-Leibler divergence, which can be estimated in a high-dimensional space from many computations of distances to the k-th nearest neighbor in that space. Because these computations are very expensive, we propose a parallel GPU implementation (using NVIDIA's CUDA software interface) of the exhaustive search for the k nearest neighbors. We show that this implementation speeds up object tracking by up to a factor of 15 compared with an implementation of this search that requires the data to be structured beforehand.
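
To make the role of the k-th nearest neighbor distances concrete, here is a minimal CPU sketch in Python of one commonly used kNN-based estimator of the Kullback-Leibler divergence between two sample sets. It is an illustration under stated assumptions, not the thesis implementation: the exact estimator used in the thesis may differ, and the function names `kth_nn_distance` and `knn_kl_divergence` are hypothetical. The exhaustive search inside `kth_nn_distance` is the step that is independent for every query point, which is why it lends itself to a parallel GPU implementation.

```python
import math
import random


def kth_nn_distance(query, points, k, skip_index=None):
    """Distance from `query` to its k-th nearest neighbor in `points`.

    Exhaustive (brute-force) search: every distance is computed, then sorted.
    `skip_index` excludes one point, used when `query` itself belongs to `points`.
    """
    dists = sorted(
        math.dist(query, p) for i, p in enumerate(points) if i != skip_index
    )
    return dists[k - 1]


def knn_kl_divergence(xs, ys, k=1):
    """kNN-based estimate of D(p || q) from samples xs ~ p and ys ~ q.

    Assumed estimator (one common choice, not necessarily the thesis's):
        D = (d / n) * sum_i log(nu_k(x_i) / rho_k(x_i)) + log(m / (n - 1))
    where rho_k is the k-th neighbor distance within xs (excluding x_i itself)
    and nu_k is the k-th neighbor distance from x_i to ys.
    Assumes no duplicate points, so that no distance is zero.
    """
    n, m, d = len(xs), len(ys), len(xs[0])
    total = 0.0
    for i, x in enumerate(xs):
        rho = kth_nn_distance(x, xs, k, skip_index=i)
        nu = kth_nn_distance(x, ys, k)
        total += math.log(nu / rho)
    return d * total / n + math.log(m / (n - 1))


if __name__ == "__main__":
    rng = random.Random(0)

    def cloud(n, shift):
        # 2-D Gaussian samples with unit variance, mean (shift, 0)
        return [(rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

    a, b, c = cloud(300, 0.0), cloud(300, 0.0), cloud(300, 3.0)
    print("same distribution:   ", knn_kl_divergence(a, b, k=5))
    print("shifted distribution:", knn_kl_divergence(a, c, k=5))
```

Each call to `kth_nn_distance` scans all candidate points, so the cost grows with the product of the two sample sizes and with the dimension. No data structure has to be built first, but the distance computations dominate the running time, which matches the motivation given above for moving the exhaustive search to the GPU.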
