Scalable Cache-Optimized Concurrent FIFO Queue for Multicore Architectures

The concurrent FIFO queue is a fundamental, widely used data structure for parallelizing software. In this letter, we introduce a novel concurrent FIFO queue algorithm for multicore architectures. We achieve better scalability by reducing contention among concurrent threads, and improve performance by optimizing cache-line usage. Experimental results on a server with eight cores show that our algorithm outperforms state-of-the-art algorithms by a factor of two.

key words: FIFO queue, multicore processor, cache-line contention, compare-and-swap, fetch-and-store
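To make the fetch-and-store primitive named in the key words concrete, the following is a minimal C++17 sketch of a well-known multi-producer, single-consumer linked queue in which each enqueue performs one atomic exchange on the tail pointer rather than a compare-and-swap retry loop. It is not the algorithm introduced in this letter. The class name FetchAndStoreQueue, the 64-byte cache-line size, and the single-consumer restriction are all assumptions made for illustration.

```cpp
#include <atomic>
#include <optional>
#include <utility>

// Illustrative sketch only; NOT the algorithm proposed in this letter.
// Assumptions: C++17, a 64-byte cache line, T is default-constructible,
// any number of producers but exactly one consumer thread.
template <typename T>
class FetchAndStoreQueue {
    struct Node {
        std::atomic<Node*> next{nullptr};
        T value{};
    };

    // Keep the producer-side and consumer-side pointers on separate
    // cache lines so enqueuers and the dequeuer do not falsely share.
    alignas(64) std::atomic<Node*> tail_;
    alignas(64) Node* head_;  // touched only by the single consumer

public:
    FetchAndStoreQueue() {
        Node* stub = new Node;  // dummy node: the queue is never physically empty
        head_ = stub;
        tail_.store(stub, std::memory_order_relaxed);
    }

    // Must not run concurrently with enqueue() or dequeue().
    ~FetchAndStoreQueue() {
        for (Node* n = head_; n != nullptr;) {
            Node* next = n->next.load(std::memory_order_relaxed);
            delete n;
            n = next;
        }
    }

    FetchAndStoreQueue(const FetchAndStoreQueue&) = delete;
    FetchAndStoreQueue& operator=(const FetchAndStoreQueue&) = delete;

    // Fetch-and-store: one unconditional atomic exchange claims the tail,
    // so producers never retry the way a compare-and-swap loop would.
    void enqueue(T v) {
        Node* node = new Node;
        node->value = std::move(v);
        Node* prev = tail_.exchange(node, std::memory_order_acq_rel);
        // Link the previous tail to the new node; this publishes the value.
        prev->next.store(node, std::memory_order_release);
    }

    // Single consumer only. Returns std::nullopt if no linked element is visible.
    std::optional<T> dequeue() {
        Node* next = head_->next.load(std::memory_order_acquire);
        if (next == nullptr) {
            return std::nullopt;
        }
        T v = std::move(next->value);
        delete head_;   // retire the old dummy node
        head_ = next;   // the dequeued node becomes the new dummy
        return v;
    }
};
```

In this sketch a producer that has exchanged the tail but not yet linked its node briefly hides later elements from the consumer, so dequeue may report empty while an enqueue is in flight. A compare-and-swap design avoids that window at the cost of retries under contention.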