Evaluating Overhead and Contention in Concurrent Accesses to a Graph

The widespread adoption of multicore processors reinforces the need for strategies to implement multithreaded programs. Because synchronization methods that coordinate access to shared data introduce contention, new strategies for implementing concurrent data structures can lead to performance gains. This paper presents a case study in which a graph data structure is implemented using three low-contention strategies: one based on low-level atomic operations, one based on mutexes, and one using transactional memory. Results show that the first achieves the best performance and the second the worst, while the third offers programmers a higher level of abstraction with performance similar to the first.
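To make the contrast between the mutex-based and atomic-based strategies concrete, the following is a minimal C++ sketch and not the paper's implementation. It assumes an insert-only adjacency-list graph. The names `EdgeNode`, `MutexGraph`, `AtomicGraph`, and `concurrent_inserts` are hypothetical and chosen only for illustration. The transactional-memory variant is omitted because it depends on compiler-specific support.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One adjacency-list node: an outgoing edge to vertex `to`. (Hypothetical type.)
struct EdgeNode {
    std::size_t to;
    EdgeNode* next;
};

// Frees every node of a singly linked adjacency list.
inline void free_list(EdgeNode* head) {
    while (head != nullptr) {
        EdgeNode* next = head->next;
        delete head;
        head = next;
    }
}

// Mutex-based variant: one lock per vertex serializes access to that vertex's list.
class MutexGraph {
public:
    explicit MutexGraph(std::size_t n) : heads_(n, nullptr), locks_(n) {}
    ~MutexGraph() { for (EdgeNode* h : heads_) free_list(h); }

    void add_edge(std::size_t from, std::size_t to) {
        assert(from < heads_.size());
        std::lock_guard<std::mutex> guard(locks_[from]);
        heads_[from] = new EdgeNode{to, heads_[from]};
    }

    std::size_t degree(std::size_t v) {
        std::lock_guard<std::mutex> guard(locks_[v]);
        std::size_t n = 0;
        for (EdgeNode* e = heads_[v]; e != nullptr; e = e->next) ++n;
        return n;
    }

private:
    std::vector<EdgeNode*> heads_;
    std::vector<std::mutex> locks_;
};

// Atomic variant: each list head is updated with a compare-and-swap retry loop.
// Assumption: edges are only inserted, never removed, so no reclamation or ABA
// handling is needed in this sketch.
class AtomicGraph {
public:
    explicit AtomicGraph(std::size_t n) : heads_(n) {
        for (auto& h : heads_) h.store(nullptr, std::memory_order_relaxed);
    }
    ~AtomicGraph() { for (auto& h : heads_) free_list(h.load()); }

    void add_edge(std::size_t from, std::size_t to) {
        assert(from < heads_.size());
        EdgeNode* node =
            new EdgeNode{to, heads_[from].load(std::memory_order_relaxed)};
        // On failure, compare_exchange_weak reloads the current head into
        // node->next, so the loop simply retries with the fresh value.
        while (!heads_[from].compare_exchange_weak(
            node->next, node,
            std::memory_order_release, std::memory_order_relaxed)) {
        }
    }

    std::size_t degree(std::size_t v) const {
        std::size_t n = 0;
        for (EdgeNode* e = heads_[v].load(std::memory_order_acquire);
             e != nullptr; e = e->next) ++n;
        return n;
    }

private:
    std::vector<std::atomic<EdgeNode*>> heads_;
};

// Spawns `threads` workers that each add `per_thread` edges out of vertex 0,
// then returns the resulting out-degree of vertex 0.
template <typename Graph>
std::size_t concurrent_inserts(Graph& g, unsigned threads, std::size_t per_thread) {
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&g, per_thread] {
            for (std::size_t i = 0; i < per_thread; ++i) g.add_edge(0, i);
        });
    }
    for (auto& th : pool) th.join();
    return g.degree(0);
}
```

In this sketch all workers target the same vertex, which is the highest-contention case: the mutex variant makes writers wait for the lock, whereas the atomic variant lets them proceed and retry only when a compare-and-swap fails. The sketch does not measure performance and makes no claim about the relative results reported in the paper.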
