HotDB'02 course project reports: hot topics in database systems, fall 2002

This technical report contains eight final project reports contributed by ten participants in "Hot Topics in Database Systems," a CMU advanced graduate course offered by Professor Anastassia Ailamaki in Fall 2002. The course covers advanced research issues in modern database system design through paper presentations and discussion. In Fall 2002, topics included query optimization, data stream and adaptive query processing, continuous queries, self-tuning database systems, interaction between the database software and the underlying hardware, distributed and peer-to-peer DBMS, and XML applications. The participating students studied and evaluated cutting-edge research papers on each topic and addressed the related issues in class discussions. Inspired by the course material, the students proposed and carried out a total of eight projects. The projects involved innovative research and implementation, and the students worked in teams of two or individually. The project reports were carefully evaluated by the students using a conference program committee-style blind-review process. The resulting camera-ready papers are available in this technical report. Several of these reports (as noted on their first page) have resulted in conference submissions. The projects were presented using posters and demos during a half-day HotDB workshop held at Carnegie Mellon on December 10, 2002.
