ELF: Efficient Lightweight Fast Stream Processing at Scale

Stream processing has become a key means of gaining rapid insights from webserver-captured data. The challenges are to scale to numerous concurrently running streaming jobs, to coordinate across those jobs so they can share insights, to change job functions online as requirements or data characteristics change, and, for each job, to operate efficiently over different time windows. The ELF stream processing system addresses these challenges. ELF is implemented as a set of agents that enrich the web tier of datacenter systems, and it obtains scalability through a decentralized "many masters" architecture. For each job, live data is extracted directly from webservers and placed into memory-efficient compressed buffer trees (CBTs) for local parsing and temporary storage. The data is then aggregated using shared reducer trees (SRTs) mapped to sets of worker processes. Job masters at the roots of SRTs can dynamically customize worker actions, obtain aggregated results for delivery to end users, and coordinate with other jobs. An ELF prototype, implemented and evaluated in a large-scale configuration, demonstrates scalability, high per-node throughput, sub-second job latency, and a sub-second ability to adjust the actions of running jobs.
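
The two-stage flow described above (local aggregation on each webserver, then merging up a reducer tree to a job master) can be illustrated with a minimal Python sketch. This is not ELF's implementation: a plain `Counter` stands in for a CBT and does not model compression or buffering, and the function names, the `fan_in` parameter, and the "<user> <url>" log format are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def local_aggregate(log_lines: Iterable[str]) -> Counter:
    """Parse one webserver's log lines and count requests per URL.

    A Counter is a stand-in for a compressed buffer tree (CBT);
    it only captures the group-by-aggregate behaviour.
    """
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        # Hypothetical log format: "<user> <url>"; skip malformed lines.
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def tree_reduce(partials: List[Counter], fan_in: int = 2) -> Counter:
    """Merge partial results level by level, as a reducer tree would.

    Each pass merges groups of `fan_in` siblings into one parent;
    the single Counter left at the end is what the root (job master) sees.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not partials:
        return Counter()
    level = list(partials)
    while len(level) > 1:
        level = [
            sum(level[i:i + fan_in], Counter())
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


# Three hypothetical webservers, each with its own local log.
servers = [
    ["u1 /a", "u2 /b"],
    ["u3 /a"],
    ["u4 /c", "u5 /a", "bad"],
]
result = tree_reduce([local_aggregate(s) for s in servers])
```

Because the merge is associative and commutative, the tree's shape and fan-in affect only how the work is distributed, not the final aggregate.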
