Advanced motion compensation for video coding

New predictive video coding schemes increasingly include techniques that open up new possibilities for advanced coding functionalities. To that end, video content is segmented into regions that can then be processed separately and thus treated independently in subsequent applications. However, most schemes, above all the standards, still apply pixel-based techniques, such as block-based motion prediction and block-based transform coding, which could be replaced by more sophisticated methods to achieve better coding quality, particularly subjective quality. One goal of this work was to implement a video coder based on the classical scheme in which the motion prediction stage is replaced by a technique based on irregular triangular meshes and the transform coding stage is replaced by a wavelet transform. These are two very important parts of the video coding process that several authors have investigated separately but never in combination. Building on the coder presented, the work also focuses on region processing, that is, on the possibilities of processing objects (to obtain the same user-interaction functionalities as those proposed by the MPEG-4 standard) and other special types of regions, so that they can be coded preferentially with respect to the rest of the image. On the one hand, this refers to regions of special interest to the viewer, such as facial regions, which could be coded with better quality than the rest of the image (for example, to keep the quality of that region constant at low bit rates). On the other hand, regions with high prediction error are selected which, when coded preferentially, can raise image quality considerably. With the aim of always finding the most efficient and least computationally expensive method, several ideas for designing meshes adapted to the image content, among them a technique based on facial features, were investigated in the motion prediction stage. In the transform coding stage, the shape-adaptive wavelet transform for arbitrary shapes, which MPEG-4 adopted for still-image coding, was applied to the regions mentioned above and to objects in the usual sense. Finally, preferential coding was achieved with different bit allocation algorithms developed specifically for processing multiple regions. The results show that combining a mesh-based motion prediction technique with the wavelet transform is better suited to achieving objective and subjective improvements in coding quality than either of these techniques applied in combination with block-based processes. In general, the objective and subjective results are comparable with those of other schemes and standards, and even surpass them in certain situations. In addition, the region processing methods achieved very good results, not only in terms of quality but above all in terms of functionality; that is, they made it possible to create useful new tools.
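As an illustration of what triangular-mesh motion prediction means in general, the following minimal sketch interpolates a motion vector inside one mesh triangle from the motion vectors of its three nodes, using barycentric weights (equivalent to an affine motion model per triangle). This is a textbook formulation and not necessarily the method implemented in this work; the function names `barycentric` and `interpolate_motion` and the sample coordinates are assumptions chosen for the example.

```python
def barycentric(p, a, b, c):
    """Return the barycentric weights (wa, wb, wc) of point p in triangle abc."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_motion(p, nodes, node_motion):
    """Interpolate the motion vector at p from the three node motion vectors.

    nodes:       three (x, y) node positions of one mesh triangle (hypothetical data)
    node_motion: three (dx, dy) motion vectors, one per node
    """
    weights = barycentric(p, *nodes)
    dx = sum(w * mv[0] for w, mv in zip(weights, node_motion))
    dy = sum(w * mv[1] for w, mv in zip(weights, node_motion))
    return dx, dy


# Example triangle and node motion vectors (illustrative values only).
nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
node_motion = [(2.0, 0.0), (0.0, 4.0), (0.0, 0.0)]

# Midpoint of the first edge: the two adjacent node vectors are averaged.
mid_motion = interpolate_motion((2.0, 0.0), nodes, node_motion)
```

A predicted pixel at position `p` would then be fetched from the reference frame at `p` displaced by the interpolated vector, typically with bilinear sampling; because neighbouring triangles share nodes, the resulting motion field is continuous across triangle edges, unlike block-based prediction.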
