In today's data centers supporting Internet-scale computing and I/O services, an increasing number of network-intensive applications are deployed as network services. It is therefore critical for these applications to quickly retrieve requests from the network and send their responses back. To facilitate this, the operating system usually provides an event notification mechanism so that applications (or libraries) know when the network is ready to supply data for them to read or to accept data for them to write. As a widely used and representative notification mechanism, epoll in Linux provides a scalable and high-performance implementation by allowing applications to indicate specifically which connections, and which events on them, need to be watched. Although epoll is used in major systems, including key-value (KV) systems such as Redis and Memcached and web servers such as NGINX, we have identified a substantial performance issue in its use. For the sake of efficiency, applications usually use epoll's system calls to inform the kernel of exactly which events they are interested in, and they always keep this information up to date. However, in a system with demanding network traffic, such rigid maintenance of the information is unnecessary, and the excessive number of system calls made for this purpose can substantially degrade the system's performance. In this paper, we use Redis as an example to explore the issue. We propose a strategy that informs the kernel of the events of interest in a manner adaptive to the current network load, so that the number of epoll system calls is reduced and events are delivered efficiently. We have implemented the strategy, named FlexPoll, in Redis without modifying any kernel code. Our evaluation on Redis shows that query throughput can be improved by up to 46.9% on micro-benchmarks, and by up to 67.8% on workloads emulating real-world access patterns. FlexPoll can be extended in a straightforward manner to other applications and event libraries built on the epoll mechanism.
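To make the cost of rigidly maintained interest sets concrete, the sketch below counts the interest-update calls a conventional event loop issues per request. It is an illustration of the baseline pattern only, not of FlexPoll, whose algorithm the abstract does not describe. It assumes Linux (Python's `select.epoll` is Linux-only) and uses a local socket pair in place of a real client connection; the function name `count_interest_updates` is hypothetical.

```python
import select
import socket


def count_interest_updates(num_requests):
    """Serve ping/pong exchanges and return how many interest updates were issued.

    Each register()/modify() call corresponds to one epoll_ctl system call.
    """
    server, client = socket.socketpair()  # stand-in for a client connection
    ep = select.epoll()
    ctl_calls = 0
    try:
        # Initially the server only cares about incoming requests.
        ep.register(server.fileno(), select.EPOLLIN)
        ctl_calls += 1

        for _ in range(num_requests):
            client.sendall(b"ping")

            # Wait until the request is readable, then consume it.
            assert ep.poll(1.0), "timed out waiting for readability"
            server.recv(16)

            # A reply is pending: tell the kernel we now also want EPOLLOUT.
            ep.modify(server.fileno(), select.EPOLLIN | select.EPOLLOUT)
            ctl_calls += 1

            # Wait until the socket is writable, then send the reply.
            assert ep.poll(1.0), "timed out waiting for writability"
            server.sendall(b"pong")

            # Reply sent: drop EPOLLOUT so the interest set stays exact.
            ep.modify(server.fileno(), select.EPOLLIN)
            ctl_calls += 1

            client.recv(16)  # client consumes the reply
    finally:
        ep.close()
        server.close()
        client.close()
    return ctl_calls
```

In this sketch every request costs two extra interest updates on top of the two waits, so the number of `epoll_ctl`-style calls grows linearly with the request rate. This is the kind of overhead that a load-adaptive strategy aims to reduce.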