Dissecting the CUDA scheduling hierarchy: a Performance and Predictability Perspective
Nicola Capodieci | Marko Bertogna | Andrea Marongiu | Ignacio Sañudo Olmedo | Jorge Luis Martinez
[1] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Joseph Zambreno, et al. Increasing GPU throughput using kernel interleaved thread block scheduling, 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[3] R. Govindarajan, et al. Improving GPGPU concurrency with elastic kernels, 2013, ASPLOS '13.
[4] Nicola Capodieci, et al. Deadline-Based Scheduling for GPU with Preemption Support, 2018, 2018 IEEE Real-Time Systems Symposium (RTSS).
[5] Nicola Capodieci, et al. Memory interference characterization between CPU cores and integrated GPUs in mixed-criticality platforms, 2017, 2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA).
[6] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[7] Esteban Walter Gonzalez Clua, et al. Maximizing the GPU resource usage by reordering concurrent kernels submission, 2019, Concurr. Comput. Pract. Exp.
[8] Nicola Capodieci, et al. Work-in-Progress: NVIDIA GPU Scheduling Details in Virtualized Environments, 2018, 2018 International Conference on Embedded Software (EMSOFT).
[9] Hyeran Jeon, et al. Tango: A Deep Neural Network Benchmark Suite for Various Accelerators, 2019, 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[10] Ming Yang, et al. GPU Scheduling on the NVIDIA TX2: Hidden Details Revealed, 2017, 2017 IEEE Real-Time Systems Symposium (RTSS).
[11] Marco Maggioni, et al. Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking, 2018, ArXiv.
[12] Ajay Jain, et al. Dynamic Space-Time Scheduling for GPU Inference, 2018, ArXiv.
[13] Jianlong Zhong, et al. Kernelet: High-Throughput GPU Kernel Executions with Dynamic Slicing and Scheduling, 2013, IEEE Transactions on Parallel and Distributed Systems.
[14] Paolo Valente, et al. SiGAMMA: server based integrated GPU arbitration mechanism for memory accesses, 2017, RTNS.
[15] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Francisco J. Cazorla, et al. Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier, 2019, ECRTS.
[17] Hao Li, et al. Performance modeling in CUDA streams - A means for high-throughput data processing, 2014, 2014 IEEE International Conference on Big Data (Big Data).
[18] Henry Wong, et al. Analyzing CUDA workloads using a detailed GPU simulator, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[19] Wen-mei W. Hwu, et al. Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing, 2012.
[20] Ming Yang, et al. Avoiding Pitfalls when Using NVIDIA GPUs for Real-Time Tasks in Autonomous Systems, 2018, ECRTS.
[21] Nanning Zheng, et al. Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[22] Hadi Sadoghi Yazdi, et al. cCUDA: Effective Co-Scheduling of Concurrent Kernels on GPUs, 2020, IEEE Transactions on Parallel and Distributed Systems.
[23] R. Govindarajan, et al. Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[24] Gert-Jan van den Braak, et al. Analysis and Modeling of the Timing Behavior of GPU Architectures, 2014.
[25] T. Steinke, et al. On Improving the Performance of Multi-threaded CUDA Applications with Concurrent Kernel Execution by Kernel Reordering, 2012, 2012 Symposium on Application Accelerators in High Performance Computing.
[26] Nicola Capodieci, et al. A Perspective on Safety and Real-Time Issues for GPU Accelerated ADAS, 2018, IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society.