A Data-Mining-Based Prefetching Approach to Caching for Network Storage Systems

The need for network storage has been increasing rapidly, owing to the widespread use of the Internet in organizations and to the shortage of local storage space caused by the growing size of applications and databases. The proliferation of network storage systems entails a significant increase in the number of storage objects (e.g., files) stored, the number of concurrent clients, and the size and number of storage objects transferred between the systems and their clients. The performance of these systems (e.g., client-perceived latency) has therefore become a major concern. Previous research has explored scaling up the number of storage servers to enhance the performance of network storage systems. However, adding servers is an expensive way to improve system performance. Moreover, for a WAN-based network storage system, the performance bottleneck typically lies not in the load on the storage servers but in the network traffic between clients and storage servers. This paper introduces an Internet-based network storage system named NetShark and proposes a caching-based performance-enhancement solution for such a system. The proposed solution is validated using a simulation.
