WAN OPTIMIZATION BY APPLICATION INDEPENDENT DATA REDUNDANCY ELIMINATION (BY CACHING)

A huge amount of data flows through the Internet today: traffic exceeds 25,000 GB per second. This load leads to problems such as congestion, high latency, and low throughput. For a smooth user experience, the Wide Area Network (WAN) must therefore be optimized. Many WAN optimization techniques are available to increase data-flow efficiency across the WAN; the main ones include deduplication, caching, latency optimization, traffic shaping, and protocol spoofing. This article mainly discusses deduplication. Deduplication eliminates the transfer of redundant data across the network by sending a reference to the data instead of the data itself. It is also called Data Redundancy Elimination (DRE). DRE can be implemented at the packet level, where redundant packets are detected and eliminated, or at the byte level, where repeated byte strings are detected and replaced with shorter references. Surveys have shown that DRE can reduce data traffic by 15% to 60%. This article discusses the major DRE techniques available and the challenges of implementing DRE algorithms.
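The reference-instead-of-data idea can be illustrated with a minimal Python sketch. It is not the method of any particular DRE system: it assumes fixed-size chunks, SHA-1 fingerprints used as references, and sender and receiver caches that stay synchronized because both sides see the same stream. The names `CHUNK_SIZE`, `encode`, `decode`, and `wire_size` are hypothetical and chosen only for illustration.

```python
import hashlib

CHUNK_SIZE = 64  # hypothetical fixed chunk size in bytes (an assumption)


def fingerprint(chunk: bytes) -> bytes:
    """Return a 20-byte SHA-1 digest used as the reference for a chunk."""
    return hashlib.sha1(chunk).digest()


def encode(data: bytes, cache: dict) -> list:
    """Sender side: replace chunks already in the cache with references."""
    tokens = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = fingerprint(chunk)
        if fp in cache:
            tokens.append(("ref", fp))      # redundant: send only the reference
        else:
            cache[fp] = chunk               # first sighting: remember the chunk
            tokens.append(("raw", chunk))   # and send it in full
    return tokens


def decode(tokens: list, cache: dict) -> bytes:
    """Receiver side: expand references using the receiver's own cache."""
    out = bytearray()
    for kind, payload in tokens:
        if kind == "ref":
            out += cache[payload]
        else:
            cache[fingerprint(payload)] = payload  # mirror the sender's cache
            out += payload
    return bytes(out)


def wire_size(tokens: list) -> int:
    """Payload bytes that would cross the WAN (token framing ignored)."""
    return sum(len(payload) for _, payload in tokens)
```

Calling `encode` on the sender and `decode` on the receiver, each with its own initially empty dictionary, reconstructs the original bytes, and a message sent a second time is carried entirely as references. Fixed-size chunking is used here only for brevity; it misses redundancy when repeated content is shifted by even one byte, which is one reason byte-level schemes usually rely on content-dependent chunk boundaries.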
