Compression Proxy Server: Design and Implementation

Automatic data compression in the web proxy server is an important mechanism that can significantly reduce network bandwidth consumption and web access latency. Unlike traditional data compression, however, web protocols and data have characteristics that make compression challenging: data arrives as streamed blocks, objects vary widely in size and type, and responses must be delivered in real time. In this paper, we focus on automatic web data compression in the HTTP proxy server. We propose a new classification of web data compression based on system complexity and HTTP requirements: stream, block, and file compression. We then introduce the concept of hybrid web data compression. To better understand the potential of web data compression, we describe an implementation of the proposed hybrid compression in the Squid proxy server. The results are promising: about 30% of the bandwidth can be saved. Furthermore, even with a low-end Pentium 266 MHz PC as the proxy machine, the compression overhead is less than 1% of the transfer time.
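The difference between compressing data as it streams through a proxy and compressing a complete object can be illustrated with a minimal sketch. It uses Python's standard zlib module and is not the Squid implementation described in the paper; the function names compress_stream and compress_whole are hypothetical, and mapping them onto the paper's stream and file categories is an assumption made only for illustration.

```python
import zlib


def compress_stream(chunks):
    """Compress an iterable of byte chunks incrementally.

    Illustrative stand-in for stream-style compression (an assumption,
    not the paper's code): each chunk is compressed as it arrives and
    flushed so the receiver can decode it without waiting for the rest.
    """
    compressor = zlib.compressobj()
    for chunk in chunks:
        out = compressor.compress(chunk)
        # Z_SYNC_FLUSH pushes all pending output, at some cost in ratio.
        out += compressor.flush(zlib.Z_SYNC_FLUSH)
        if out:
            yield out
    # The final flush terminates the zlib stream.
    yield compressor.flush()


def compress_whole(data):
    """Compress a complete object in one call.

    Illustrative stand-in for file-style compression (an assumption):
    the whole object must be available before any output is produced.
    """
    return zlib.compress(data)


if __name__ == "__main__":
    chunks = [b"<html>" * 100, b"hello world " * 200, b"</html>"]
    streamed = b"".join(compress_stream(chunks))
    whole = compress_whole(b"".join(chunks))
    print(len(b"".join(chunks)), len(streamed), len(whole))
```

The streamed variant trades some compression ratio for the ability to forward data immediately, which is the kind of tension between real-time response and compression efficiency that the abstract points to.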
