Improving IBM POWER8 Performance Through Symbiotic Job Scheduling