Storing and Compressing Video into Neural Networks by Overfitting

We propose a video compression technique based on neural networks that differs fundamentally from existing approaches. Existing techniques use a neural network as a video compressor: an original video is fed into the network, and the network outputs a compressed representation of that video. In this paper, we instead propose storing a video in a neural network itself, so that the network is treated as the compressed representation of the video. Our goal is to find a network configuration that stores and compresses the video efficiently. We implement two models. One is a multilayer perceptron whose input is a frame index \(t\) and whose output is \(\text{frame}_t\). The other is a long short-term memory (LSTM) network whose inputs are past frames such as \(\text{frame}_{t-2}\) and \(\text{frame}_{t-1}\), and whose output is the predicted \(\text{frame}_t\). If the amount of data required to express the network is smaller than the video itself, the network can be regarded as a compressed video. Our experiments show that the data amount of the LSTM implementation was larger than the original video, whereas that of the multilayer perceptron was about half of the original video, with a structural similarity (SSIM), a metric of image quality, of about 0.99.
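To make the two architectures concrete, the following is a minimal sketch, assuming PyTorch (the paper does not name its framework). The layer widths, frame resolution, learning rate, and iteration count are illustrative assumptions, not the paper's configuration; the overfitting loop at the end fits the MLP to a single video so that its weights become that video's compressed representation.

```python
import torch
import torch.nn as nn

class FrameMLP(nn.Module):
    """Multilayer perceptron mapping a frame index t to frame_t.

    Overfitting this network to one video turns its weights into a
    compressed representation of that video.
    """
    def __init__(self, frame_height=144, frame_width=176, hidden=256):
        super().__init__()
        out_dim = frame_height * frame_width
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim), nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, t):
        # t: (batch, 1) tensor of normalized frame indices
        return self.net(t)

class FramePredictorLSTM(nn.Module):
    """LSTM predicting frame_t from past frames such as frame_{t-2}, frame_{t-1}."""
    def __init__(self, frame_height=144, frame_width=176, hidden=256):
        super().__init__()
        frame_dim = frame_height * frame_width
        self.lstm = nn.LSTM(frame_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, frame_dim)

    def forward(self, past_frames):
        # past_frames: (batch, seq_len, frame_dim), e.g. seq_len = 2
        out, _ = self.lstm(past_frames)
        return torch.sigmoid(self.head(out[:, -1]))  # predicted frame_t

# Overfit the MLP to one (placeholder) video of 300 flattened grayscale frames.
video = torch.rand(300, 144 * 176)
model = FrameMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
t = torch.linspace(0.0, 1.0, video.shape[0]).unsqueeze(1)  # normalized indices
for step in range(10_000):
    loss = nn.functional.mse_loss(model(t), video)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Under this framing, the "compressed file" is the trained parameter set (e.g. the output of `model.state_dict()`), and decoding is a forward pass per frame index; compression is achieved only if that parameter set is smaller than the raw video, which is the comparison the experiments report.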