On the Performance of Tagged Translation Lookaside Buffers: A Simulation-Driven Analysis

Recent virtualization-driven CPU architectural extensions tag the entries of the hardware-managed Translation Lookaside Buffer (TLB) so that the TLB need not be flushed on context switches, which lets multiple address spaces share the TLB. While tagged TLBs are expected to improve the performance of virtualized workloads, this improvement has not yet been systematically evaluated, nor have its dependence on TLB- and workload-related factors or the performance implications of the contention that arises from TLB sharing. This paper undertakes these investigations using a simulation-driven approach. We develop a simulation model for the tagged TLB and integrate it into a full-system simulation framework. Using this model, we show that the performance impact of using tagged TLBs ranges from 1% to 25% and depends strongly on the size of the TLB, the TLB miss penalty, the nature of the workload, and the type of tag used. We also simulate consolidated workloads and use the observations to highlight the performance variation caused by resource contention in the shared TLB. We further explore a static TLB usage control scheme that isolates the TLB behavior of one application in a consolidated workload from this contention-induced variation. Finally, we show that, in a consolidated workload scenario, the performance improvement due to tagged TLBs can be further increased by 1.4X for selected high-priority applications by restricting the TLB usage of other low-priority workloads.
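
To make the mechanism concrete, the following is a minimal sketch of a toy TLB model. It is not the simulation model developed in this paper. It assumes a fully associative TLB with LRU replacement, and the names `ToyTLB`, `switch_to`, `access`, `run`, and the per-tag `quota` parameter are all hypothetical, introduced only for illustration. The sketch contrasts flushing on every context switch with keeping tagged entries, and shows one possible form a static per-tag usage limit could take.

```python
from collections import OrderedDict


class ToyTLB:
    """Toy fully associative LRU TLB (illustrative assumption only)."""

    def __init__(self, capacity, tagged, quota=None):
        self.capacity = capacity
        self.tagged = tagged
        # Hypothetical static limits: tag -> max entries (assumed >= 1).
        self.quota = quota or {}
        self.entries = OrderedDict()  # (tag, vpn) keys, oldest first
        self.current = None
        self.misses = 0

    def switch_to(self, tag):
        # An untagged TLB must be flushed when the address space changes.
        if not self.tagged and tag != self.current:
            self.entries.clear()
        self.current = tag

    def access(self, vpn):
        key = (self.current, vpn)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        self.misses += 1
        limit = self.quota.get(self.current)
        own = [k for k in self.entries if k[0] == self.current]
        if limit is not None and len(own) >= limit:
            # Over quota: evict this tag's own least recently used entry.
            del self.entries[own[0]]
        elif len(self.entries) >= self.capacity:
            # TLB full: evict the globally least recently used entry.
            self.entries.popitem(last=False)
        self.entries[key] = None
        return False


def run(tlb, rounds=3, pages=4):
    """Alternate between address spaces "A" and "B", touching `pages` pages each."""
    for _ in range(rounds):
        for tag in ("A", "B"):
            tlb.switch_to(tag)
            for vpn in range(pages):
                tlb.access(vpn)
    return tlb.misses


flushed = run(ToyTLB(8, tagged=False))                 # flush on each switch
shared = run(ToyTLB(8, tagged=True))                   # entries survive switches
contended = run(ToyTLB(6, tagged=True))                # working sets exceed capacity
limited = run(ToyTLB(6, tagged=True, quota={"B": 2}))  # B capped at 2 entries
print(flushed, shared, contended, limited)
```

In this toy setting the tagged TLB avoids the misses that follow each flush as long as both working sets fit. When they do not fit, the two address spaces evict each other's entries, and capping "B" keeps the entries of "A" resident. The miss counts printed here are properties of the toy model only and say nothing about the measured results reported in the paper.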
