Valmar: High-bandwidth real-time streaming data management

In applications ranging from radio telescopes to Internet traffic monitoring, our ability to generate data has outpaced our ability to effectively capture, mine, and manage it. These ultra-high-bandwidth data streams typically contain little useful information, and most of the data can be safely discarded. Periodically, however, an event of interest is observed and a large segment of the data must be preserved, including data that preceded detection of the event. Doing so requires guaranteed data capture at source rates, line-speed filtering to detect events and data points of interest, and a TiVo-like ability to save past data once an event has been detected. We present Valmar, a system for guaranteed capture, indexing, and storage of ultra-high-bandwidth data streams. Our results show that Valmar performs at nearly full disk bandwidth, up to several orders of magnitude faster than flat-file and database systems, works well with both small and large data elements, and allows concurrent read and search access without compromising data capture guarantees.
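The idea of saving data that preceded an event can be illustrated with a small in-memory sketch. This is not Valmar's implementation, which works against disk at source rates; it only shows the general pre-trigger ring buffer technique. The function name `capture_with_history` and the parameters `is_event`, `pre`, and `post` are hypothetical names chosen for this illustration.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional


def capture_with_history(
    stream: Iterable[int],
    is_event: Callable[[int], bool],
    pre: int,
    post: int,
) -> List[List[int]]:
    """Return one preserved segment per detected event (illustrative only).

    Each segment holds up to `pre` elements seen before the event, the
    event element itself, and `post` elements after the most recent event.
    All other elements are discarded as they fall out of the ring buffer.
    """
    history: Deque[int] = deque(maxlen=pre)  # bounded ring of recent elements
    segments: List[List[int]] = []
    current: Optional[List[int]] = None      # segment being recorded, if any
    remaining = 0                            # post-event elements still owed

    for item in stream:
        hit = is_event(item)
        if current is None:
            if not hit:
                # No event yet: keep the element only until it ages out.
                history.append(item)
                continue
            # Event detected: preserve the buffered past plus the event.
            current = list(history) + [item]
            history.clear()
            remaining = post
        else:
            # Already recording: a further event restarts the post window.
            current.append(item)
            remaining = post if hit else remaining - 1
        if remaining == 0:
            segments.append(current)
            current = None

    if current is not None:
        # Stream ended mid-recording: keep what was captured so far.
        segments.append(current)
    return segments
```

For example, with `pre=2` and `post=1`, a stream in which a single element satisfies `is_event` yields one segment containing the two elements before it, the element itself, and the one after it; everything else is dropped.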
