Spin lock killed the performance star

The Internet has grown rapidly, requiring more processing power each year to handle user requests in a timely fashion. In the multicore world, adding server-side threads should improve server performance. However, several studies have shown that this does not hold in practice, and they identify the Linux kernel as the possible culprit. Our working hypothesis is that the kernel does not provide a scalable interface for network communications. Through various tests, we narrowed the problem down to the implementation of the spin lock mechanism (a synchronization structure used mostly at the kernel level), which has been inherited from early versions of the Linux kernel. Only now, with the emergence of multicore architectures, have users begun to notice the performance penalty that the existing spin lock implementation imposes on parallel systems, especially in multithreaded network protocols. We therefore recommend that spin locks be redesigned so that the full power of multicore systems can be harnessed.
