Fast Text Compression with Neural Networks

Neural networks have the potential to extend data compression algorithms beyond the character-level n-gram models now in use, but have usually been avoided because they are too slow to be practical. We introduce a model that produces better compression than popular Lempel-Ziv compressors (zip, gzip, compress) and is competitive in time, space, and compression ratio with PPM and Burrows-Wheeler algorithms, currently the best known. The compressor, a bit-level predictive arithmetic encoder using a 2-layer, 4 × 10^6 by 1 network, is fast (about 10^4 characters/second) because only 4-5 connections are simultaneously active and because it uses a variable learning rate optimized for one-pass training.
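
The following Python sketch illustrates the kind of predictor described above: a two-layer network (many inputs, one output) that estimates the probability of the next bit, with only a few inputs active at a time and weights trained online in a single pass. It is a minimal illustration under stated assumptions, not the model evaluated in this paper. The class and function names, the hash function, the choice of four hashed byte contexts, the small table size, and the count-based learning-rate schedule are all hypothetical. The arithmetic coder itself is omitted; the sketch instead sums the ideal code length, -log2 p, which an arithmetic coder would approach.

```python
import math


class SparseBitPredictor:
    """Inputs -> single sigmoid output; predicts P(next bit = 1)."""

    def __init__(self, n_inputs=1 << 16, base_rate=0.5, min_rate=0.02):
        # n_inputs is far smaller than 4 x 10^6 to keep the sketch light.
        self.n = n_inputs
        self.w = [0.0] * n_inputs     # one weight per input unit
        self.count = [0] * n_inputs   # how often each input has been active
        self.base_rate = base_rate    # assumed schedule parameters
        self.min_rate = min_rate

    def active_inputs(self, history, partial):
        """Select one input per context of order 1..4 (assumed hash)."""
        idx = []
        for order in range(1, 5):
            h = order
            for b in history[-order:]:
                h = (h * 0x2F0B3A55 + b + 1) & 0xFFFFFFFF
            # 'partial' holds the bits of the current byte seen so far.
            h = (h * 0x2F0B3A55 + partial) & 0xFFFFFFFF
            idx.append(h % self.n)
        return idx

    def predict(self, idx):
        # Only the active inputs contribute, so the cost is O(len(idx)).
        s = sum(self.w[i] for i in idx)
        s = max(-30.0, min(30.0, s))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, idx, p1, bit):
        # Gradient step on log loss; the rate decays with usage (assumption).
        err = bit - p1
        for i in idx:
            self.count[i] += 1
            rate = max(self.base_rate / self.count[i], self.min_rate)
            self.w[i] += rate * err


def ideal_compressed_bits(data, model):
    """One pass over data: predict each bit, charge -log2 p, then train."""
    total = 0.0
    history = []
    for byte in data:
        partial = 1  # leading 1 marks how many bits have been seen
        for k in range(7, -1, -1):
            bit = (byte >> k) & 1
            idx = model.active_inputs(history, partial)
            p1 = model.predict(idx)
            p = p1 if bit else 1.0 - p1
            total += -math.log2(max(p, 1e-12))
            model.update(idx, p1, bit)
            partial = (partial << 1) | bit
        history.append(byte)
        del history[:-4]  # keep only the last four bytes
    return total
```

For example, calling `ideal_compressed_bits(b"abracadabra " * 200, SparseBitPredictor())` returns the estimated compressed size in bits, which can be compared against the uncompressed size of `8 * len(data)` bits. Because each prediction touches only four weights regardless of table size, the per-bit cost stays constant as the network grows, which is the property the abstract credits for the compressor's speed.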