Database Systems for Advanced Applications

Network-based services have become a ubiquitous part of our lives, to the point where individuals and businesses often rely on them critically. Building and maintaining reliable, high-performance network and service infrastructures requires the ability to rapidly investigate and resolve complex service- and performance-impacting issues. To achieve this, it is important to collect, correlate, and analyze massive amounts of data from a diverse collection of data sources in real time. We have designed and implemented a variety of data systems at AT&T Labs-Research to build highly scalable databases that support real-time data collection, correlation, and analysis, including (a) the Daytona data management system, (b) the DataDepot data warehousing system, (c) the GS tool data stream management system, and (d) the Bistro data feed manager. Together, these data systems have enabled the creation and maintenance of a data warehouse and data analysis infrastructure for troubleshooting complex issues in the network. We describe these data systems and their key research contributions in this talk.

S.-g. Lee et al. (Eds.): DASFAA 2012, Part I, LNCS 7238, p. 1, 2012. © Springer-Verlag Berlin Heidelberg 2012
