Deep Feature Compression Using Spatio-Temporal Arrangement Toward Collaborative Intelligent World

Collaborative Intelligence is a new paradigm for deploying a deep neural network (DNN)-based image recognition application, in which the DNN is split into an edge part and a cloud part. In this paradigm, the deep features output by the edge DNN are compressed and transmitted to the cloud DNN. Because many of the responses in the deep features resemble one another, previous methods arrange the deep features spatially and compress them as an image, exploiting this similarity as spatial correlation. However, if the deep features are arranged not only spatially but also temporally, as in a video, temporal correlation could be exploited as well, and the features might be compressed more efficiently. To explore this possibility, we propose a “spatio-temporal arrangement” method. It arranges the deep features spatially as images and temporally as a video, using a novel ordering search algorithm. The method increases the spatial and temporal correlations hidden in the deep features and achieves higher compression efficiency than the previous methods. Experimental results show that it improves on the previous methods by 1.50% to 4.98% in BD-Rate in a lossy setting. Our analysis shows that the method effectively increases the correlation when the input is an image with rich edges and textures.
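To make the idea of arranging feature channels in both spatial and temporal directions concrete, the following is a minimal sketch and not the ordering search algorithm proposed here, whose details are not given in this abstract. It assumes that the deep features are a list of equally sized 2-D channels, that similarity is measured by mean absolute difference, and that a greedy nearest-neighbour chain is an acceptable stand-in for the ordering search. All function names (mean_abs_diff, greedy_order, tile_frame, arrange) are hypothetical.

```python
from typing import List, Tuple

Channel = List[List[float]]  # one H x W feature map (hypothetical representation)


def mean_abs_diff(a: Channel, b: Channel) -> float:
    """Average absolute difference between two equally sized channels."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def greedy_order(channels: List[Channel]) -> List[int]:
    """Assumed stand-in for an ordering search: start at channel 0 and
    repeatedly append the most similar channel that is still unused."""
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        # Ties are broken by the lower channel index.
        nxt = min(remaining, key=lambda i: (mean_abs_diff(last, channels[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def tile_frame(channels: List[Channel], cols: int) -> Channel:
    """Spatial arrangement: tile channels row-major into one 2-D frame."""
    frame: Channel = []
    for start in range(0, len(channels), cols):
        row_tiles = channels[start:start + cols]
        for r in range(len(row_tiles[0])):
            frame.append([v for tile in row_tiles for v in tile[r]])
    return frame


def arrange(channels: List[Channel], num_frames: int,
            cols: int) -> Tuple[List[int], List[Channel]]:
    """Temporal arrangement: deal the ordered channels out across frames so
    that co-located tiles in consecutive frames are neighbours in the chain.
    Assumes len(channels) is divisible by num_frames."""
    order = greedy_order(channels)
    frames = []
    for f in range(num_frames):
        frame_channels = [channels[i] for i in order[f::num_frames]]
        frames.append(tile_frame(frame_channels, cols))
    return order, frames
```

The resulting frames could then be passed to a video codec in place of a single tiled image. The ordering must also be known to the cloud side so that the channels can be restored to their original positions before the cloud DNN runs.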