Compressed Video Sensing

Recently, Compressed Sensing (also called Compressive Sampling) has attracted attention as an innovative concept in signal processing. Compressed sensing proposes that, when dealing with signals that are highly compressible in a known basis, for example a wavelet basis, one can dispense with traditional sampling and instead take a small number of samples that are functionals of the whole data stream. The signal is reconstructed by solving a linear program, yielding an object whose basis coefficients have minimal ℓ1 norm. In this work we consider video streams and reconstruct volumes from a subset of time series, and show that they are highly compressible in a wavelet basis. We conclude that compressed sensing is highly appropriate for representing video streams. The concept is new in that it integrates sensing and compression in a single step; in particular, compression is achieved without any analysis of the video source. To implement Compressed Video Sensing, each pixel time series is encoded by multiplication with a random matrix having many fewer rows than columns, giving compression combined with encryption. The encoding involves only simple arithmetic, and the compression achieved is within a reasonable multiple of the best that would be possible by computationally intensive means. The encoded data are secure when the encoding matrix is used as a one-time pad. Moreover, the encoded data have a built-in error-correction feature, in that a small fraction of corrupted samples does not upset the reconstruction process. In hybrid CS schemes, different strategies are used for the coarse-scale and fine-scale signal content. We apply these notions to video by storing a conventionally sampled low-resolution temporal stream and, in addition, a CS representation of the full temporal resolution.
The stream can then be reconstructed cheaply when viewed at low resolution, for example in routine use, yet high-resolution segments can be made available on demand.

We describe a method for early-stage video processing that inexpensively compresses and encrypts video data. The video stream is multiplied by a non-square random matrix, which scrambles it so that it is encrypted and reduces its dimensionality so that it is compressed. The encryption matrix can be viewed as a one-time pad that is completely secure, and the compression comes within a reasonable factor of the best compression available using much more sophisticated processing. Such an encryption algorithm requires a huge amount of pad, but the pad (i.e., the random matrix) does not have to be transmitted; the sender and receiver can agree on it in advance, so it imposes no significant burden. In effect, we perform source coding and encryption simultaneously. It is true that we cannot compress below the entropy of the source, but our point is that the raw video stream has a data rate well above the entropy rate of the source, so the CS stream is still highly compressed. At the same time there is an automatic error-correction effect: the encrypted, compressed stream is robust to occasional erasures, corruption, and packet loss. The scheme is inexpensive at the recording/capture stage and expensive at the playback/decryption/reconstruction stage: while the capture and encoding of the data are simple, high-resolution reconstruction involves convex optimization. The approach we describe has a variety of applications. On the one hand, it could be useful for sensor networks in which low-power devices need to cheaply capture and send data at a low rate while being immune to spying and to error bursts in transmission.
On the other hand, it is useful for cameras with highly variable data rates, creating video streams that are typically watched at regular speed but contain an embedded stream allowing reconstruction of a much finer time-resolution sequence on demand.

The mathematical foundation of our proposal is the abstract notion of Compressed Sensing [1, 2, 4]. In this approach, one captures measurements not of individual samples but of general linear combinations of samples that are seemingly 'random'; each measurement combines data from samples widely distributed across the data stream. Moreover, the number of measurements is smaller than the number of samples, and is comparable to, but somewhat larger than, the minimal number of measurements needed to characterize the signal. The driving idea is that the signal should be compressible when represented in a fixed basis such as the wavelet basis, with relatively few large coefficients. Mathematical analysis has shown that if we take more 'random' measurements than the number of significant basis coefficients, the object whose coefficients have the smallest ℓ1 norm is a good approximation to the underlying object, despite the apparent undersampling. In this work we study single-pixel video streams of surveillance-style imagery and show that they consist of slow, smooth trends combined with occasional abrupt spikes. As a result they are compressible in the wavelet basis, with a few coarse-scale wavelet coefficients representing the smooth trends and a few fine-scale wavelet coefficients representing the abrupt changes. This key finding shows that video signals meet the requirements for applicability of the theory of compressed sensing. We describe three deployments of compressed sensing in the video framework. For example, in one attractive hybrid scheme, we record coarse-scale measurements of wavelet coefficients and use compressed sensing only at the fine scales to recover the abrupt changes.
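The compressibility claim and the coarse/fine split of the hybrid scheme can be illustrated with a small sketch. It is not the authors' code: the orthonormal Haar wavelet, the synthetic pixel time series (a constant level with one abrupt change and one spike), the cut-off `n_coarse`, and the function names `haar_forward` and `haar_inverse` are all assumptions chosen for illustration.

```python
import numpy as np

def haar_forward(x):
    """Orthonormal Haar transform of a signal whose length is a power of two.

    Output layout: [scaling coefficient, coarsest detail, ..., finest details].
    """
    c = np.asarray(x, dtype=float).copy()
    out = np.empty_like(c)
    n = c.size
    while n > 1:
        avg = (c[0:n:2] + c[1:n:2]) / np.sqrt(2.0)
        dif = (c[0:n:2] - c[1:n:2]) / np.sqrt(2.0)
        out[n // 2:n] = dif   # detail coefficients at this scale
        c[:n // 2] = avg      # averages feed the next, coarser scale
        n //= 2
    out[0] = c[0]
    return out

def haar_inverse(w):
    """Inverse of haar_forward."""
    w = np.asarray(w, dtype=float)
    c = w[:1].copy()
    n = 1
    while n < w.size:
        d = w[n:2 * n]
        up = np.empty(2 * n)
        up[0::2] = (c + d) / np.sqrt(2.0)
        up[1::2] = (c - d) / np.sqrt(2.0)
        c = up
        n *= 2
    return c

# Synthetic single-pixel time series (an assumption, not real video data).
x = np.full(256, 100.0)
x[100:] = 120.0   # abrupt change in level
x[200] += 50.0    # isolated spike

w = haar_forward(x)
n_significant = int(np.count_nonzero(np.abs(w) > 1e-9))

# Hybrid split (assumed cut-off): the first n_coarse coefficients would be
# stored directly; only the remaining fine-scale ones would be sensed with CS.
n_coarse = 16
coarse, fine = w[:n_coarse], w[n_coarse:]
n_fine_significant = int(np.count_nonzero(np.abs(fine) > 1e-9))

print(n_significant, "of", w.size, "Haar coefficients are non-negligible")
print(n_fine_significant, "of those lie at the fine scales")
```

For this toy signal only a small fraction of the 256 coefficients is non-negligible, which is the kind of sparsity the theory requires. Real surveillance streams with smooth (rather than piecewise-constant) trends would be compressible rather than exactly sparse, and a smoother wavelet than Haar may suit them better.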
We demonstrate simultaneous 4:1 lossless compression, encryption, and error-control coding using processes that require only simple matrix multiplication, that is, only additions and multiplications. As we show, our methods are effective in the presence of noise and camera jitter, and are stable in the presence of packet loss and data erasure.
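As a rough illustration of the pipeline summarized above, the following sketch encodes a sparse stand-in for one pixel time series at a 4:1 ratio and reconstructs it. Several choices are assumptions rather than the authors' method: a seeded Gaussian matrix plays the role of the pre-shared pad (NumPy's generator is not cryptographically secure, so this shows only the mechanics), the signal is taken to be sparse in the time domain rather than in a wavelet basis, the positions of erased measurements are assumed to be known to the receiver, and iterative soft thresholding (ISTA) on an ℓ1-penalized least-squares problem stands in for the linear program. The names `make_pad` and `ista` are hypothetical.

```python
import numpy as np

def make_pad(seed, m, n):
    """Random m-by-n encoding matrix, regenerated from a shared seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((m, n)) / np.sqrt(m)

def ista(A, y, lam=0.02, n_iter=5000):
    """Iterative soft thresholding for min 0.5*||A x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x + A.T @ (y - A @ x) / L  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage
    return x

n, m, seed = 256, 64, 7                # 4:1 reduction in the number of samples

# Stand-in signal (assumed): four abrupt spikes on a zero baseline.
x = np.zeros(n)
x[[20, 75, 140, 230]] = [1.0, -1.0, 1.5, -2.0]

# Sender: a single matrix-vector product.
y = make_pad(seed, m, n) @ x

# Receiver: regenerate the same matrix from the shared seed, then reconstruct.
A = make_pad(seed, m, n)
x_hat = ista(A, y)
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)

# Erasures: drop four measurements (positions assumed known) and reconstruct
# from the remaining rows only.
keep = np.ones(m, dtype=bool)
keep[[3, 17, 30, 51]] = False
x_hat_erased = ista(A[keep], y[keep])
rel_err_erased = np.linalg.norm(x_hat_erased - x) / np.linalg.norm(x)

print(f"relative error, all measurements: {rel_err:.3f}")
print(f"relative error, 4 erased:         {rel_err_erased:.3f}")
```

The ℓ1 penalty slightly shrinks the recovered coefficients, so this sketch yields a close approximation rather than an exact reconstruction; solving the equality-constrained ℓ1 problem as a linear program, as the text describes, avoids that bias.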

[1] David L. Donoho, et al. Compressed sensing, 2006, IEEE Transactions on Information Theory.

[2] Iddo Drori, et al. Fast Minimization by Iterative Thresholding for Multidimensional NMR Spectroscopy, 2007, EURASIP J. Adv. Signal Process.

[3] Sudipto Guha, et al. Near-optimal sparse Fourier representations via sampling, 2002, STOC '02.

[4] Emmanuel J. Candès, et al. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, 2004, IEEE Transactions on Information Theory.