
The increasing ability to interconnect computers through internetworking, wireless networks, high-bandwidth satellite, and cable networks has spawned a new class of information-centered applications based on data dissemination. These applications, which often serve huge client populations, employ broadcast to deliver data to clients efficiently. In data dissemination, the data transfer is initiated by the server, inverting the traditional relationship between the client and the server. This thesis proposes a novel “multi-disk” framework for data dissemination called Broadcast Disks. The Broadcast Disks approach significantly improves upon previous work in dissemination-based systems and raises a number of fundamentally new research challenges. In this thesis, we first explain why the rise of asymmetric environments (i.e., networks that have significantly higher bandwidth available from servers to clients than in the reverse direction) and the scale of emerging distributed information systems are causing a shift from the traditional pull-based client-server model to a push-based dissemination model. Then, the Broadcast Disks model is introduced and explored using a simulation-based study and a working implementation. The bulk of this thesis uses simulation results to understand the basic tradeoffs in a dissemination framework. We demonstrate the performance benefits of the Broadcast Disks model over the traditional approach to structuring a dissemination program. We also introduce a new cost-based approach to client cache management and develop efficient algorithms for prefetching and dissemination of updates. Then, the thesis addresses the issues that arise in supporting clients using both push- and pull-based data delivery. Finally, we describe our experience in building and testing a Broadcast Disks prototype. While the studies using the prototype validate the simulation-based intuitions, they also raise many new issues and highlight some of the shortcomings of the current technology for building push-based systems. The underlying theme driving the studies in this thesis is to develop techniques that improve system performance and scalability and that adapt to the new tradeoffs in the emerging computing landscape.
