Trace Analysis and its Applications to Performance Enhancements of Distributed Information Systems

This work is dedicated to my father, mother, and my dearest friends.

Acknowledgments

I am deeply grateful for the guidance and support of my advisor, Professor Azer Bestavros. Throughout my stay at Boston University, Azer's insightful comments and advice provided me with different views that expanded my horizons at a difficult moment of topic definition, and the enriching discussions with Azer led to this work. I thank Azer for his support and for his friendship.

I am thankful to Carlos Felipe de Brito Jaccoud. Carlos introduced me to the methods of Digital Signal Processing and helped me with the tools necessary for the development of the DSP model in Chapter 3. We exchanged many e-mail messages clarifying points that were unclear to each of us: on my side, DSP; on his side, points related to user behavior and the model I was working on.

I thank the three readers of my dissertation, Azer Bestavros, Mark Crovella and Abdelsalam Heddaya, whose patience, comments and suggestions have contributed significant improvements to my dissertation. I would also like to thank the other members of my committee, Virgílio Almeida and Wayne Snyder, for giving me their votes of confidence. I also thank Steven Homer, who participated in the proposal process and reviewed the algorithm analysis in Chapter 5.

I am very grateful for the generous financial support I received while preparing my dissertation. This support was provided by the Brazilian National Council for Research and Development (CNPq) in the form of a scholarship during the first four years of the program, together with the help of the people in the Computer Science Department at Boston University, who allowed me to teach and hold a teaching fellowship in the last year of the program.

I would like to thank Lou Hennessy, the former department system administrator, who greatly helped me with many useful tips on the system and who gave me the opportunity to work as a student system administrator to complement my learning and my financial support. I also thank the Brazilian Telecommunications Company (Embratel) for having granted me a leave of absence, without which I would not have had the peace of mind to complete this arduous task.

I thank the many other people of the Computer Science Department of Boston University who helped provide a pleasant atmosphere, in particular Joseph Malloy (who followed my process since …
