Continuous analytics: data stream query processing in practice

Stream query processing has been one of the more popular topics in database research so far this century. The basic idea is to provide database-style query processing over data on the fly as they arrive at the system. Compared to the store-first, query-later approach followed by traditional database systems, stream query processing holds the promise of dramatically improved efficiency and reduced latency. Work in this area was originally motivated by "real-time" data-intensive scenarios such as sensor networks, financial trading applications, and network security. Lately, stream processing has been moving from the research lab into the real world through efforts at start-up companies, traditional database vendors, and open source projects. Not surprisingly, the practical uses and advantages of the technology are turning out to be different from what many had originally expected. In this talk, I'll survey the state of the art in stream query processing and related technologies such as event processing, discuss some of the implications for data-intensive system architectures, and provide my views on the future role of this technology from both a research and a commercial perspective. In particular, I'll describe the notion of Continuous Analytics, which leverages stream query processing techniques to solve some of the inherent bottlenecks that have existed in database systems since their inception. I will also discuss several implementation issues that arose through experience with specific application deployments, including the need to handle out-of-order data and high-cardinality dimensions.