Dynamic I/O-Aware Scheduling for Batch-Mode Applications on Chip Multiprocessor Systems of Cluster Platforms
Lei Liu | Lei Wang | Xiaobing Feng | Fang Lu | Pen-Chung Yew | Chenggang Wu | Huimin Cui
[1] Qing Yi,et al. Layout-oblivious compiler optimization for matrix computations , 2013, TACO.
[2] Randy H. Katz,et al. Above the Clouds: A Berkeley View of Cloud Computing , 2009 .
[3] Joel H. Saltz,et al. Tuning the performance of I/O-intensive parallel applications , 1996, IOPADS '96.
[4] Christoph Lameter,et al. Local and Remote Memory: Memory in a Linux/NUMA System , 2006 .
[5] Xiaobing Feng,et al. Software-Hardware Cooperative DRAM Bank Partitioning for Chip Multiprocessors , 2010, NPC.
[6] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[7] Z. Lin,et al. Parallelizing I/O intensive applications for a workstation cluster: a case study , 1993, CARN.
[8] Ravi Jain,et al. Scheduling Parallel I/O Operations in Multiple Bus Systems , 1992, J. Parallel Distributed Comput.
[9] Thomas R. Gross,et al. Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead , 2011, ISMM '11.
[10] Lin Gao,et al. Loop recreation for thread‐level speculation on multicore processors , 2010, Softw. Pract. Exp.
[11] Alok Choudhary,et al. Exploiting Shared Memory to Improve Parallel I/O Performance , 2006, PVM/MPI.
[12] Dongrui Fan,et al. Extendable pattern-oriented optimization directives , 2012, International Symposium on Code Generation and Optimization (CGO 2011).
[13] Lingjia Tang,et al. Compiling for niceness: mitigating contention for QoS in warehouse scale computers , 2012, CGO '12.
[14] Chita R. Das,et al. Towards characterizing cloud backend workloads: insights from Google compute clusters , 2010, PERV.
[15] Pen-Chung Yew,et al. On mitigating memory bandwidth contention through bandwidth-aware scheduling , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[16] Kyung D. Ryu,et al. Efficient Network and I/O Throttling for Fine-Grain Cycle Stealing , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[17] 김성찬,et al. V-band Quadruple Sub-harmonic Mixer with High Conversion Gain and Isolation Characteristics , 2003 .
[18] Rajeev Thakur,et al. Data sieving and collective I/O in ROMIO , 1998, Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation.
[19] Lin Gao,et al. Exploiting Speculative TLP in Recursive Programs by Dynamic Thread Prediction , 2009, CC.
[20] Luiz André Barroso,et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009 .
[21] Alan L. Cox,et al. Scheduling I/O in virtual machine monitors , 2008, VEE '08.
[22] Li Chen,et al. PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization , 2009, Journal of Computer Science and Technology.
[23] Anand Sivasubramaniam,et al. Xen and co.: communication-aware CPU scheduling for consolidated xen-based hosting platforms , 2007, VEE '07.
[24] Manuel Prieto,et al. Survey of scheduling techniques for addressing shared resources in multicore processors , 2012, CSUR.
[25] Kevin Skadron,et al. Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[26] Alexandra Fedorova,et al. Addressing shared resource contention in multicore processors via scheduling , 2010, ASPLOS XV.
[27] Lin Gao,et al. Thread-Sensitive Modulo Scheduling for Multicore Processors , 2008, 2008 37th International Conference on Parallel Processing.
[28] Ioan Raicu,et al. I/O Throttling and Coordination for MapReduce , 2012 .
[29] Kai Shen,et al. FIOS: a fair, efficient flash I/O scheduler , 2012, FAST.
[30] Xiaobing Feng,et al. An empirical model for predicting cross-core performance interference on multicore processors , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[31] Chen Ding,et al. Defensive loop tiling for shared cache , 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[32] Yang Yang,et al. Automatic Library Generation for BLAS3 on GPUs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[33] Devarshi Ghoshal,et al. I/O performance of virtualized cloud environments , 2011, DataCloud-SC '11.
[34] Ravi Jain,et al. Heuristics for Scheduling I/O Operations , 1997, IEEE Trans. Parallel Distributed Syst.
[35] Lin Gao,et al. Loop recreation for thread-level speculation , 2007, 2007 International Conference on Parallel and Distributed Systems.
[36] Jie Chen,et al. Analysis and approximation of optimal co-scheduling on Chip Multiprocessors , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[37] Luiz André Barroso,et al. The Case for Energy-Proportional Computing , 2007, Computer.