Applications that require real-time processing of high-volume data streams are pushing the limits of traditional data processing infrastructures. These stream-based applications include market feed processing and electronic trading on Wall Street, network and infrastructure monitoring, fraud detection, and command and control in military environments. Furthermore, as the "sea change" caused by cheap micro-sensor technology takes hold, we expect to see everything of material significance on the planet get "sensor-tagged" and report its state or location in real time. This sensorization of the real world will lead to a "green field" of novel monitoring and control applications with high-volume and low-latency processing requirements.

Recently, several technologies have emerged, including off-the-shelf stream processing engines, specifically to address the challenges of processing high-volume, real-time data without requiring the use of custom code. At the same time, some existing software technologies, such as main memory DBMSs and rule engines, are also being "repurposed" by marketing departments to address these applications.

In this paper, we outline eight requirements that system software should meet to excel at a variety of real-time stream processing applications. Our goal is to provide high-level guidance to information technologists so that they will know what to look for when evaluating alternative stream processing solutions. As such, this paper serves a purpose comparable to the requirements papers in relational DBMSs and on-line analytical processing. We also briefly review alternative system software technologies in the context of our requirements.

The paper attempts to be vendor neutral, so no specific commercial products are mentioned.