Understanding Real-World Concurrency Bugs in Go

Go is a statically typed programming language that aims to provide a simple, efficient, and safe way to build multithreaded software. Since its creation in 2009, Go has matured and gained significant adoption in production and open-source software. Go advocates message passing as the means of inter-thread communication and provides several new concurrency mechanisms and libraries to ease multithreaded programming. It is important to understand the implications of these new mechanisms and how message passing compares with shared-memory synchronization in terms of program errors, or bugs. Unfortunately, as far as we know, there has been no study of Go's concurrency bugs. In this paper, we perform the first systematic study of concurrency bugs in real Go programs. We studied six popular Go applications, including Docker, Kubernetes, and gRPC. We analyzed 171 concurrency bugs in total, more than half of which were caused by non-traditional, Go-specific problems. Apart from the root causes of these bugs, we also studied their fixes, performed experiments to reproduce them, and evaluated them with two publicly available Go bug detectors. Overall, our study provides a better understanding of Go's concurrency models and can guide future researchers and practitioners in writing better, more reliable Go software and in developing debugging and diagnosis tools for Go.
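To make the contrast between shared-memory synchronization and message passing concrete, the following minimal sketch computes the same sum in both styles. It is an illustration only and is not taken from the studied applications; the function names `sumShared` and `sumMessages` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// sumShared (hypothetical name) adds the values using shared memory:
// every goroutine updates one variable, and a mutex serializes the updates.
func sumShared(values []int) int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	for _, v := range values {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			total += v
			mu.Unlock()
		}(v)
	}
	wg.Wait() // wait until every goroutine has added its value
	return total
}

// sumMessages (hypothetical name) adds the same values using message
// passing: every goroutine sends its value on a channel, and only the
// calling goroutine touches the total.
func sumMessages(values []int) int {
	// Buffered to len(values) so that no sender blocks.
	results := make(chan int, len(values))
	for _, v := range values {
		go func(v int) { results <- v }(v)
	}
	total := 0
	for range values {
		total += <-results // receive exactly one message per sender
	}
	return total
}

func main() {
	values := []int{1, 2, 3, 4}
	fmt.Println(sumShared(values), sumMessages(values))
}
```

In the shared-memory version, omitting the lock would introduce a data race on `total`. In the message-passing version, a mismatch between the number of sends and receives on an unbuffered channel could leave a goroutine blocked forever, which is one way a channel-based program can fail without any shared variable being involved.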
