Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII

Data Warehouses (DW) store valuable information not only for strategic business decisions, but also for daily operational decisions. As a consequence, a large number of queries are submitted concurrently, stressing the database engine's ability to handle such query workloads without severely degrading query response times. The query-at-a-time model of common database engines, where each query is executed independently and competes for the same resources, is inefficient for handling large DWs and does not provide the expected performance and scalability when processing large numbers of concurrent queries. Related work shows that there is a performance advantage in sharing data and processing, but the proposed solutions suffer from memory limitations, reduced scalability and unpredictable execution times when applied to large DWs, particularly those with large dimensions. SPIN proposes an approach to share computation and data among concurrent queries that delivers scale-up, even in the presence of massive query workloads. In this paper we describe the mechanisms used by SPIN to embed data and queries into a shared query processing pipeline tree and how SPIN dynamically reorganizes the processing tree. We also provide experimental validation of the approach.
