Understanding and Addressing Blocking-Induced Network Server Latency

We investigate the origin and components of network server latency under various loads and find that filesystem-related kernel queues exhibit head-of-line blocking, which leads to bursty behavior in event delivery and process scheduling. These problems in turn degrade the operating system's existing fairness and scheduling policies, causing requests that could have been served from memory with low latency to wait unnecessarily on disk-bound requests. While this batching behavior only mildly affects throughput, it severely degrades latency. The problem manifests itself as a degradation in fairness and service quality, a phenomenon we call service inversion. We show a portable solution that avoids these problems without kernel or filesystem modifications. We modify two different Web servers to use this approach and demonstrate a qualitative change in their latency profiles, with latency reduced by more than an order of magnitude. The resulting systems are able to serve most requests without being tied to disk performance, and they scale better with improvements in processor speed. These results do not depend on server software architecture and can be profitably applied to both experimental and production servers.
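The head-of-line blocking effect described above can be illustrated with a toy model. The sketch below is not the solution evaluated in this work; it is a minimal simulation under stated assumptions: all requests arrive at time zero, service times are hypothetical values in microseconds, and in the second scenario disk-bound and in-memory requests are assumed to be served by fully independent queues that overlap perfectly. The function names and the `disk_threshold` parameter are invented for illustration.

```python
def fifo_latencies(service_times):
    """Completion times when one FIFO queue serves every request in order.

    Each request waits for all requests ahead of it, so a single slow
    (disk-bound) request at the head delays every fast request behind it.
    """
    clock = 0
    latencies = []
    for service in service_times:
        clock += service
        latencies.append(clock)
    return latencies


def split_latencies(service_times, disk_threshold):
    """Completion times when slow and fast requests use separate queues.

    Assumption: a request whose service time is at least disk_threshold is
    treated as disk-bound, and the two queues make progress independently.
    """
    fast_clock = 0
    slow_clock = 0
    latencies = []
    for service in service_times:
        if service >= disk_threshold:
            slow_clock += service
            latencies.append(slow_clock)
        else:
            fast_clock += service
            latencies.append(fast_clock)
    return latencies


# Hypothetical workload: one disk-bound request followed by three
# in-memory requests (service times in microseconds).
workload = [10_000, 100, 100, 100]

blocked = fifo_latencies(workload)
unblocked = split_latencies(workload, disk_threshold=1_000)

print("single FIFO queue:", blocked)
print("separate queues:  ", unblocked)
```

In this toy workload the total work is the same in both cases, which mirrors the observation that throughput is only mildly affected, while the latency of the in-memory requests differs by well over an order of magnitude.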
