Swift is a modern multi-paradigm programming language with an extensive developer community and open-source ecosystem. Swift 3's memory management strategy is based on Automatic Reference Counting (ARC) augmented with unsafe APIs for manually managed memory. We have seen ARC consume as much as 80% of program execution time. A significant portion of ARC's direct performance cost can be attributed to its use of atomic machine instructions to protect reference count updates from data races. Consequently, we have designed and implemented dynamic atomicity, an optimization that safely replaces atomic reference-counting operations with nonatomic ones where feasible. The optimization introduces a store barrier to detect possibly inter-thread references, compiler-generated recursive reference-tracers to find all affected objects, and a bit of state in each reference count to encode its atomicity requirements. Using a suite of 171 microbenchmarks, 9 programs from the Computer Language Benchmarks Game, and the Richards benchmark, we performed a limit study by unsafely making all reference counting operations nonatomic. We measured potential speedups of up to 220% on the microbenchmarks, 120% on the Benchmarks Game, and 70% on Richards. By automatically reducing ARC overhead, our optimization both improves Swift 3's performance and reduces the temptation for performance-oriented programmers to resort to unsafe manual memory management. Furthermore, the machinery implemented for dynamic atomicity could also be employed to obtain cheaper thread-safe Swift data structures, or to augment ARC with optional cycle detection or a backup tracing garbage collector.
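The three pieces named above (a per-object atomicity bit, a store barrier, and a recursive reference tracer) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the runtime implementation: `Obj`, `Runtime`, `needsAtomicRC`, `publish`, and the operation counters are hypothetical names introduced for illustration, the counters merely stand in for atomic and nonatomic machine instructions, and release/deallocation is omitted.

```swift
// Hypothetical model of a heap object: a count, one atomicity bit, and fields.
final class Obj {
    var refCount = 1
    var needsAtomicRC = false   // assumed name for the "bit of state"
    var fields: [Obj] = []      // outgoing references held by this object
}

// Hypothetical stand-in for the runtime; it counts which kind of update
// each retain would have used instead of issuing real atomic instructions.
final class Runtime {
    var atomicOps = 0
    var nonatomicOps = 0

    // Retain picks the cheap path unless the object's bit demands atomicity.
    func retain(_ o: Obj) {
        if o.needsAtomicRC { atomicOps += 1 } else { nonatomicOps += 1 }
        o.refCount += 1
    }

    // Recursive reference tracer: mark everything reachable from `root`.
    // The early return also stops the recursion on cyclic graphs.
    func markShared(_ root: Obj) {
        if root.needsAtomicRC { return }
        root.needsAtomicRC = true
        for child in root.fields { markShared(child) }
    }

    // Assumption: some event (for example a store to a global) makes an
    // object visible to other threads; modeled here as an explicit call.
    func publish(_ o: Obj) { markShared(o) }

    // Store barrier: storing into a possibly shared holder makes the stored
    // object, and everything it reaches, possibly shared as well.
    func store(_ value: Obj, into holder: Obj) {
        if holder.needsAtomicRC { markShared(value) }
        retain(value)
        holder.fields.append(value)
    }
}

let rt = Runtime()
let local = Obj()
let item = Obj()
rt.store(item, into: local)    // holder is thread-local: nonatomic retain

let shared = Obj()
rt.publish(shared)             // shared may now be seen by other threads
rt.store(local, into: shared)  // barrier fires: local and item are marked
rt.retain(item)                // item now takes the atomic path

print(rt.nonatomicOps, rt.atomicOps)
```

In this model an object pays for atomic updates only after the barrier has seen it become reachable from a possibly shared object, so objects that stay thread-local keep the cheaper nonatomic path.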