Compressing Deep Neural Networks with Probabilistic Data Structures

This paper presents a lossy weight encoding method that complements conventional compression techniques such as weight pruning and clustering. The encoding is based on the Bloomier filter, a probabilistic data structure that saves space at the cost of introducing random errors into the stored weights. By exploiting the ability of DNNs to tolerate these imperfections, and by retraining around them, the proposed technique compresses DNN weights by up to 496× (a 1.51× improvement over the state of the art) without sacrificing model accuracy.
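
To make the underlying mechanism concrete, below is a minimal Python sketch of a Bloomier filter in the classic construction of Chazelle et al. (2004): members of the map always decode exactly, while absent keys occasionally alias to a valid value, which is the source of the random weight errors described above. The class, its parameters (`slots_per_key`, `cell_bits`), and the toy demo are illustrative assumptions, not the paper's actual implementation.

```python
import hashlib
from collections import defaultdict

class BloomierFilter:
    """Minimal Bloomier filter: an approximate key -> value map.

    Inserted keys always return their exact value; other keys usually
    decode to an invalid code, but a small fraction alias to a valid
    value (a false positive). Values must fit in `cell_bits` bits.
    """

    def __init__(self, items, slots_per_key=3, cell_bits=12, max_seeds=64):
        self.k = slots_per_key
        self.t = cell_bits
        self.m = max(self.k + 1, int(1.3 * len(items)) + 1)  # table size
        for seed in range(max_seeds):       # retry hashing until the
            self.seed = seed                # greedy peeling succeeds
            order = self._peel(list(items))
            if order is not None:
                break
        else:
            raise RuntimeError("construction failed; grow the table")
        self.table = [0] * self.m
        # Fill in reverse peel order: each key's private slot is untouched
        # by keys assigned earlier, so previous equations stay satisfied.
        for key, slot in reversed(order):
            acc = items[key] ^ self._mask(key)
            for s in self._slots(key):
                if s != slot:
                    acc ^= self.table[s]
            self.table[slot] = acc

    def _hash(self, tag, key):
        h = hashlib.blake2b(f"{self.seed}|{tag}|{key}".encode()).digest()
        return int.from_bytes(h[:8], "big")

    def _slots(self, key):
        # k distinct table positions for this key
        slots, i = [], 0
        while len(slots) < self.k:
            s = self._hash(i, key) % self.m
            if s not in slots:
                slots.append(s)
            i += 1
        return slots

    def _mask(self, key):
        return self._hash("M", key) & ((1 << self.t) - 1)

    def _peel(self, keys):
        # Greedily peel keys that own a slot no other remaining key touches.
        users = defaultdict(set)
        for key in keys:
            for s in self._slots(key):
                users[s].add(key)
        order, remaining = [], set(keys)
        while remaining:
            peeled = [k2 for k2 in remaining
                      if any(users[s] == {k2} for s in self._slots(k2))]
            if not peeled:
                return None                 # no singleton slot left
            for k2 in peeled:
                slot = next(s for s in self._slots(k2) if users[s] == {k2})
                order.append((k2, slot))
                remaining.discard(k2)
                for s in self._slots(k2):
                    users[s].discard(k2)
        return order

    def get(self, key):
        acc = self._mask(key)
        for s in self._slots(key):
            acc ^= self.table[s]
        return acc & ((1 << self.t) - 1)
```

A hypothetical usage in the spirit of the paper: store only the surviving (nonzero) weights of a pruned layer as small cluster indices, leaving pruned positions absent. An absent index usually decodes to an invalid code (treated as a zero weight), but roughly (2^a - 1)/2^t of the time it aliases to one of the 2^a - 1 valid codes, producing the random errors that retraining absorbs.

```python
import random

random.seed(0)
# 500 surviving weights out of 10,000 positions, 4-bit cluster indices (1..15)
stored = {i: random.randint(1, 15) for i in random.sample(range(10000), 500)}
bf = BloomierFilter(stored)

assert all(bf.get(i) == v for i, v in stored.items())    # members are exact
absent = [i for i in range(10000) if i not in stored]
errors = sum(1 for i in absent if 1 <= bf.get(i) <= 15)
print(f"false positives: {errors}/{len(absent)}")        # ~15/4096 ≈ 0.4%
```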