Feedback-Directed Optimizations in GCC with Estimated Edge Profiles from Hardware Event Sampling

Traditional feedback-directed optimization (FDO) in GCC uses static instrumentation to collect edge and value profiles. This method has shown good application performance gains, but is not commonly used in practice due to the high runtime overhead of profile collection, the tedious dual-compile usage model, and difficulties in generating representative training data sets. In this paper, we show that edge frequency estimates can be successfully constructed with heuristics using profile data collected by sampling of hardware events, incurring low runtime overhead (e.g., less then 2%), and requiring no instrumentation, yet achieving competitive performance gains. We describe the motivation, design, and implementation of FDO using sample profiles in GCC and also present our initial experimental results with SPEC2000int C benchmarks that show approximately 70% to 90% of the performance gains obtained using traditional FDO with exact edge profiles.