Broadcast Disks: Dissemination-based Data Management for Asymmetric Communication Environments

The increasing ability to interconnect computers through internetworking, wireless networks, high-bandwidth satellite, and cable networks has spawned a new class of information-centered applications based on data dissemination. These applications, which often serve huge client populations, employ broadcast to efficiently deliver data to the clients. In data dissemination, the data transfer is initiated by the server, inverting the traditional relationship between the client and the server. This thesis proposes a novel "multi-disk" framework for data dissemination called Broadcast Disks. The Broadcast Disks approach significantly improves upon the previous work in dissemination-based systems and raises a number of fundamentally new research challenges. In this thesis, we first explain why the rise of asymmetric environments (i.e., networks that have significantly higher bandwidth available from servers to clients than in the reverse direction) and the scale of emerging distributed information systems are causing a shift from the traditional pull-based client-server model to a push-based dissemination model. Then, the Broadcast Disks model is introduced and explored using a simulation-based study and a working implementation. The bulk of this thesis uses simulation results to understand the basic tradeoffs in a dissemination framework. We demonstrate the performance benefits of the Broadcast Disks model over the traditional approach to structuring a dissemination program. We also introduce a new cost-based approach to client cache management and develop efficient algorithms for prefetching and dissemination of updates. Then, the thesis addresses the issues that arise in supporting clients that use both push-based and pull-based data delivery. Finally, we describe our experience in building and testing a Broadcast Disks prototype. While the studies using the prototype validate the simulation-based intuitions, they also raise many new issues and highlight some of the shortcomings of the current technology for building push-based systems. The underlying theme driving the studies in this thesis is the development of techniques that improve system performance and scalability and that adapt to the new tradeoffs in the emerging computing landscape.
