Paper Information - Load Shedding in Data Stream Systems

Load Shedding in Data Stream Systems

Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this chapter, we focus on one particular type of adaptivity: the ability to gracefully degrade performance via “load shedding” (dropping unprocessed tuples to reduce system load) when the demands placed on the system cannot be met in full given available resources. Focusing on aggregation queries, we present algorithms that determine at what points in a query plan load shedding should be performed and what amount of load should be shed at each point in order to minimize the degree of inaccuracy introduced into query answers. We also discuss strategies for load shedding for other types of queries (set-valued queries, join queries, and classification queries).
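
As a rough illustration of the idea in the abstract, the sketch below sheds load for a single SUM aggregate by dropping tuples uniformly at random and scaling the result to compensate. It is a minimal sketch under stated assumptions, not the chapter's algorithm: the function name `shed_and_sum`, the parameter `keep_prob`, and the choice of uniform random dropping are all assumptions made for illustration, and the sketch does not address where in a query plan to shed or how much to shed at each point.

```python
import random


def shed_and_sum(values, keep_prob, rng):
    """Estimate sum(values) while processing only a random subset of tuples.

    Assumption: each tuple is kept independently with probability keep_prob
    (a hypothetical parameter); dropped tuples are never processed.
    The sum over kept tuples is scaled by 1 / keep_prob so that the
    estimate is unbiased in expectation.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    kept_sum = 0.0
    kept_count = 0
    for v in values:
        # Shed the tuple unless the random draw falls below keep_prob.
        if rng.random() < keep_prob:
            kept_sum += v
            kept_count += 1
    return kept_sum / keep_prob, kept_count


if __name__ == "__main__":
    stream = list(range(1, 101))  # toy stream; the true sum is 5050
    estimate, processed = shed_and_sum(stream, 0.5, random.Random(0))
    print(f"processed {processed} of {len(stream)} tuples, estimate = {estimate}")
```

A lower `keep_prob` reduces the number of tuples processed but increases the variance of the estimate, which is the accuracy trade-off the abstract refers to.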

Rajeev Motwani | Brian Babcock | Mayur Datar

[1] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.

[2] Michael Stonebraker, et al. Load Shedding in a Data Stream Manager, 2003, VLDB.

[3] Philip S. Yu, et al. Loadstar: A Load Shedding Scheme for Classifying Data Streams, 2005, SDM.

[4] Rajeev Motwani, et al. Load Shedding for Aggregation Queries over Data Streams, 2004, Proceedings of the 20th International Conference on Data Engineering.

[5] Jeffrey F. Naughton, et al. Evaluating Window Joins over Unbounded Streams, 2003, Proceedings of the 19th International Conference on Data Engineering.

[6] Abhinandan Das, et al. Approximate Join Processing over Data Streams, 2003, SIGMOD '03.

[7] Michael Stonebraker, et al. Monitoring Streams - A New Class of Data Management Applications, 2002, VLDB.

[8] Rajeev Motwani, et al. Processing Continuous Queries over Streaming Data with Limited System Resources, 2006.