OS-Based NUMA Optimization: Tackling the Case of Truly Multi-thread Applications with Non-partitioned Virtual Page Accesses
[1] Alessandro Pellegrini, et al. The ROme OpTimistic Simulator: A Tutorial, 2013, Euro-Par Workshops.
[2] Ananta Tiwari, et al. PEBIL: binary instrumentation for practical data-intensive program analysis, 2013, Cluster Computing.
[3] Xiaofeng Gao, et al. Reducing overheads for acquiring dynamic memory traces, 2005, IEEE International. 2005 Proceedings of the IEEE Workload Characterization Symposium, 2005.
[4] Philippe Olivier Alexandre Navaux, et al. Optimizing Memory Locality Using a Locality-Aware Page Table, 2014, 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing.
[5] Roberto Vitali, et al. Autonomic State Management for Optimistic Simulation Platforms, 2015, IEEE Transactions on Parallel and Distributed Systems.
[6] Alessandro Pellegrini, et al. NUMA Time Warp, 2015, SIGSIM-PADS.
[7] Richard M. Fujimoto, et al. Adaptive memory management and optimism control in time warp, 1997, TOMC.
[8] R. Fujimoto, et al. Buffer management in shared-memory time warp systems, 1995, Proceedings 9th Workshop on Parallel and Distributed Simulation (ACM/IEEE).
[9] Frank Mueller, et al. Hardware profile-guided automatic page placement for ccNUMA systems, 2006, PPoPP '06.
[10] Zizhong Chen, et al. Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications, 2012, 2012 IEEE International Conference on Cluster Computing.
[11] Henri Casanova, et al. On cluster resource allocation for multiple parallel task graphs, 2010, J. Parallel Distributed Comput.
[12] Jack J. Dongarra, et al. EZTrace: A Generic Framework for Performance Analysis, 2011, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[13] Philippe Olivier Alexandre Navaux, et al. Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors, 2010, 2010 IEEE 12th International Conference on High Performance Computing and Communications (HPCC).
[14] Xiaofeng Gao, et al. ALITER: an asynchronous lightweight instrumentation tool for event recording, 2005, CARN.
[15] Philippe Olivier Alexandre Navaux, et al. kMAF: Automatic kernel-level management of thread and data affinity, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[16] Laura Hoch. Understanding The Linux Virtual Memory Manager, 2016.
[17] David W. Nellans, et al. Handling the problems and opportunities posed by multiple on-chip memory controllers, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[18] Fernando Magno Quintão Pereira, et al. Compiler support for selective page migration in NUMA architectures, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[19] Simon W. Moore, et al. A communication characterisation of Splash-2 and Parsec, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[20] Nael B. Abu-Ghazaleh, et al. Parallel Discrete Event Simulation for Multi-Core Systems: Analysis and Optimization, 2014, IEEE Transactions on Parallel and Distributed Systems.
[21] Vivien Quéma, et al. Traffic management: a holistic approach to memory placement on NUMA systems, 2013, ASPLOS '13.
[22] Wenguang Chen, et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, 2006, ICS '06.
[23] Roberto Vitali, et al. Load sharing for optimistic parallel simulations on multi core machines, 2012, PERV.
[24] Nael B. Abu-Ghazaleh, et al. Optimization of Parallel Discrete Event Simulator for Multi-core Systems, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[25] Danny Hendler, et al. Exploiting Locality in Lease-Based Replicated Transactional Memory via Task Migration, 2013, DISC.
[26] Jeffrey K. Hollingsworth, et al. Hardware monitors for dynamic page migration, 2008, J. Parallel Distributed Comput.