A Sleep-based Communication Mechanism to Save Processor Utilization in Distributed Streaming Systems

The energy efficiency of applications deployed in data centers is becoming increasingly important. Techniques that reduce CPU utilization for specific workloads can help reduce energy consumption. One application domain that has been studied extensively in the past and is now gaining importance in data centers is distributed stream processing. In this work we examine an existing stream processing system, Borealis [6], and identify significant sources of overhead in its communication stack. Specifically, we examine the inter-node communication path in a distributed setup and the overheads incurred as streams flow from node to node. We find that the send and receive tasks in Borealis consume significant CPU resources. We redesign the send and receive paths of Borealis by replacing TCP with a user-level protocol based on Myrinet MX. We then evaluate CPU utilization and network throughput on a 10 Gbit/s network using both polling and interrupts for communicating data. Finally, we propose a sleep mechanism that avoids the CPU overheads associated with both interrupts and polling. We use a real setup consisting of four eight-core nodes equipped with 10 Gbit/s Ethernet and native Myrinet MX communication subsystems to examine the impact of our approach. Our results show that our approach saves CPU utilization for a range of workload conditions and achieves better throughput than TCP with lower CPU utilization (up to 40%).
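The abstract does not spell out how the sleep mechanism works, so the following is only a minimal sketch of the general idea, assuming a receiver that checks for data without blocking and sleeps briefly when nothing has arrived. The names `receive_with_sleep`, `try_receive`, and `make_source`, and the 0.5 ms sleep interval, are hypothetical and are not taken from Borealis or Myrinet MX.

```python
import time
from collections import deque
from typing import Callable, Optional


def receive_with_sleep(
    try_receive: Callable[[], Optional[bytes]],
    handle: Callable[[bytes], None],
    expected: int,
    sleep_s: float = 0.0005,  # assumed interval, not a value from the paper
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Receive `expected` messages and return how many times the loop slept."""
    sleeps = 0
    received = 0
    while received < expected:
        msg = try_receive()  # non-blocking check, like a single poll
        if msg is None:
            # Nothing has arrived: give up the CPU for a short time instead
            # of spinning (pure polling) or waiting on an interrupt.
            sleep(sleep_s)
            sleeps += 1
            continue
        handle(msg)
        received += 1
    return sleeps


def make_source(arrivals):
    """Hypothetical stand-in for a network endpoint; None means 'nothing yet'."""
    pending = deque(arrivals)
    return lambda: pending.popleft() if pending else None
```

For example, calling `receive_with_sleep(make_source([None, b"t1", b"t2"]), print, expected=2)` delivers both tuples after a single short sleep. The trade-off in this sketch is that a longer `sleep_s` lowers CPU utilization but adds latency to the first message that arrives during a sleep.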

[1] Toshiaki Yasue, et al. Scalable performance of system S for extract-transform-load processing, 2010, SYSTOR '10.

[2] Kun-Lung Wu, et al. COLA: Optimizing Stream Processing Applications via Graph Partitioning, 2009, Middleware.

[3] Charles L. Seitz, et al. Myrinet: A Gigabit-per-Second Local Area Network, 1995, IEEE Micro.

[4] Michael Stonebraker, et al. Aurora: a new model and architecture for data stream management, 2003, The VLDB Journal.

[5] Patrick Valduriez, et al. StreamCloud: A Large Scale Data Streaming System, 2010, 2010 IEEE 30th International Conference on Distributed Computing Systems.

[6] Jennifer Widom, et al. The CQL continuous query language: semantic foundations and query execution, 2006, The VLDB Journal.

[7] Ying Xing, et al. Scalable Distributed Stream Processing, 2003, CIDR.

[8] H.H.J. Hum, et al. Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[9] Koushik Chakraborty, et al. Hardware support for spin management in overcommitted virtual machines, 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[10] Nick Mitchell. The big pileup, 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).

[11] Kun-Lung Wu, et al. SODA: An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems, 2008, Middleware.

[12] Eric Anderson, et al. Efficiency matters!, 2010, OPSR.

[13] Anand Sivasubramaniam, et al. Worth their watts? - an empirical study of datacenter servers, 2010, HPCA-16 2010: The Sixteenth International Symposium on High-Performance Computer Architecture.

[14] Julian Hyde. Data in Flight, 2009, ACM Queue.

[15] Qiang Chen, et al. Aurora: a new model and architecture for data stream management, 2006.

[16] Christoforos E. Kozyrakis, et al. On the energy (in)efficiency of Hadoop clusters, 2010, OPSR.

[17] Wolf-Dietrich Weber, et al. Power provisioning for a warehouse-sized computer, 2007, ISCA '07.

[18] Philip S. Yu, et al. SPADE: the system s declarative stream processing engine, 2008, SIGMOD Conference.