Motion estimation and encoding algorithms for hierarchical representation of digital video

Future networks will provide a common platform for the integrated transport of a variety of services, including voice, data, and video. Emerging video applications such as digital high-definition television (HDTV), and others involving high-resolution images and graphics, will potentially generate very high bit rates (tens of megabits per second) to be transported over these networks. With the emerging concept of an open-architecture television system, which defines a scalable, flexible, and hierarchical representation of video, it is believed that the next generation of television systems will have more degrees of freedom than line resolution alone. Although fiber-optic links can provide the bandwidth to transmit these signals without any form of compression, multiplexing a number of such bit streams will require unusually high bandwidth. The storage requirements for the data generated by these applications will also be tremendous in terms of disk space. Thus, some form of data reduction or encoding will always be required to enable storage, processing, and even transmission of such data. Motion compensation, or displacement estimation, which aims to determine the path and speed of moving objects in a video scene, has been widely applied to traditional interframe coding schemes such as the discrete cosine transform (DCT), differential pulse code modulation (DPCM), and vector quantization. More recently, hierarchical coding schemes such as subband coding and pyramid representation techniques such as wavelet decomposition have been introduced. These techniques use a global decomposition of the image rather than working on small blocks or segments at a time. They therefore offer much improved subjective performance, because they lack the "blocky" artifacts intrinsic to traditional small-block transform procedures.
In this dissertation, new algorithms for motion estimation are presented, based on an autoregressive (AR) prediction model. The scheme not only performs close to the optimal full-search algorithm but also has much lower computational and search complexity. Hierarchical representation of the video signal, e.g. wavelet decomposition, provides an alternative to small-block transform-based schemes. The motion fields at different levels of the hierarchy are highly correlated, and thus the concept of prediction is extended to multiresolution motion estimation. Various scenarios in multiresolution motion estimation are discussed and evaluated for implementation complexity. Both subjective and objective performance criteria for the evaluation of coding results are discussed and, based on these, the improvements of our technique over existing and other proposed schemes are presented.
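
To illustrate the trade-off between full search and prediction-guided search, the following minimal Python sketch compares an exhaustive block-matching search with a search confined to a small window around a predicted vector. It is not the AR model of this dissertation, which the abstract does not specify. The function names, the sum-of-absolute-differences criterion, the synthetic frames, and the coarse-level vector `coarse_vec` (simply assumed here, then doubled to predict the finer-level vector) are all illustrative assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, bsize):
    """Sum of absolute differences between a block in `cur` and the
    block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(bsize):
        for x in range(bsize):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def block_search(cur, ref, bx, by, bsize, center=(0, 0), radius=4):
    """Test every displacement within `radius` of `center`.

    Returns (best_vector, best_sad, candidates_tested).
    """
    h, w = len(ref), len(ref[0])
    best_vec, best_cost, tested = None, None, 0
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # Skip candidates whose reference block leaves the frame.
            inside_x = 0 <= bx + dx and bx + dx + bsize <= w
            inside_y = 0 <= by + dy and by + dy + bsize <= h
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bsize)
            tested += 1
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost, tested


# Synthetic 32x32 reference frame with a non-repeating texture (assumption).
H = W = 32
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(W)] for y in range(H)]
# Current frame: every pixel matches the reference at displacement (3, 2).
cur = [[ref[(y + 2) % H][(x + 3) % W] for x in range(W)] for y in range(H)]

# Full search: every displacement in a +/-4 window around (0, 0).
full = block_search(cur, ref, 12, 12, 8, center=(0, 0), radius=4)

# Predicted search: a hypothetical coarse-level vector, doubled for the
# finer level, centers a much smaller +/-1 window.
coarse_vec = (1, 1)  # assumed value, not computed here
predicted = (2 * coarse_vec[0], 2 * coarse_vec[1])
pred = block_search(cur, ref, 12, 12, 8, center=predicted, radius=1)

print("full search:     ", full)
print("predicted search:", pred)
```

In this toy setup both searches recover the same displacement, but the full search evaluates 81 candidates while the predicted search evaluates 9. The saving depends entirely on how good the prediction is, which is the part the dissertation's AR model addresses.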