Oracle Workload Intelligence

Analyzing and understanding the characteristics of the incoming workload is crucial for uncovering trends and tuning the performance of a database system. In this work, we present Oracle Workload Intelligence (WI), a tool for workload modeling and mining that attempts to infer the processes that generate a given workload. WI provides two main functions. First, WI derives a model that captures the main characteristics of the workload without overfitting, which makes it likely to generalize well to unseen instances of the workload. Such a model provides insights into the most frequent code paths in the application that drives the workload, and also enables optimizations inside the database system that target sequences of query statements. Second, WI can compare the models of different snapshots of the workload to detect whether the workload has changed. Such changes might indicate new trends, regressions, problems, or even security issues. We demonstrate the effectiveness of WI with an experimental study on synthetic workloads and customer-provided application benchmarks.
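
The two functions described in the abstract, building a workload model and comparing models across snapshots, can be illustrated with a minimal sketch. The abstract does not specify WI's model or comparison metric, so everything here is an assumption made for illustration: a first-order Markov chain over statement templates, an averaged total variation distance between snapshots, and the hypothetical names `fit_transition_model`, `model_distance`, and the template labels.

```python
from collections import Counter, defaultdict


def fit_transition_model(sessions):
    """Estimate first-order transition probabilities between statement templates.

    `sessions` is a list of sequences of template labels (hypothetical input
    format; the abstract does not describe how WI represents a workload).
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Count each observed (current statement -> next statement) pair.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    model = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        model[current] = {nxt: n / total for nxt, n in followers.items()}
    return model


def model_distance(model_a, model_b):
    """Average total variation distance between per-state transition rows.

    A state that appears in only one model contributes the maximum distance 1.0.
    This metric is an illustrative choice, not the one used by WI.
    """
    states = set(model_a) | set(model_b)
    if not states:
        return 0.0
    total = 0.0
    for state in states:
        row_a = model_a.get(state)
        row_b = model_b.get(state)
        if not row_a or not row_b:
            total += 1.0
            continue
        targets = set(row_a) | set(row_b)
        total += 0.5 * sum(abs(row_a.get(t, 0.0) - row_b.get(t, 0.0)) for t in targets)
    return total / len(states)


# Two snapshots of a toy workload; the second introduces a new code path.
baseline = [
    ["SELECT_CART", "UPDATE_CART", "INSERT_ORDER"],
    ["SELECT_CART", "UPDATE_CART", "INSERT_ORDER"],
]
changed = [
    ["SELECT_CART", "DELETE_CART"],
    ["SELECT_CART", "UPDATE_CART", "INSERT_ORDER"],
]

baseline_model = fit_transition_model(baseline)
changed_model = fit_transition_model(changed)
drift = model_distance(baseline_model, changed_model)
```

In this sketch a `drift` of zero means the two snapshots have identical transition behavior, while larger values flag a change worth inspecting. A real deployment would need a threshold or significance test to separate genuine workload changes from sampling noise.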
