GPU-SAM: Leveraging multi-GPU split-and-merge execution for system-wide real-time support
Insik Shin | Wookhyun Han | Hoon Sung Chwa | Hyosu Kim | Hwidong Bae
[1] Michael González Harbour,et al. Exploiting precedence relations in the schedulability analysis of distributed real-time systems , 1999, Proceedings 20th IEEE Real-Time Systems Symposium (Cat. No.99CB37054).
[2] John Freeman,et al. From OpenCL to high-performance hardware on FPGAs , 2012, 22nd International Conference on Field Programmable Logic and Applications (FPL).
[3] Wei Zhang,et al. Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors , 2015, J. Comput. Sci. Eng..
[4] Hyesoon Kim,et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Chi-Bang Kuan,et al. Enabling an OpenCL Compiler for Embedded Multicore DSP Systems , 2012, 2012 41st International Conference on Parallel Processing Workshops.
[6] Andreas Dietrich,et al. OptiX: a general purpose ray tracing engine , 2010, SIGGRAPH 2010.
[7] Giuseppe Lipari,et al. Improved schedulability analysis of real-time transactions with earliest deadline scheduling , 2005, 11th IEEE Real Time and Embedded Technology and Applications Symposium.
[8] Doris Chen,et al. Invited paper: Using OpenCL to evaluate the efficiency of CPUs, GPUs and FPGAs for information filtering , 2012, 22nd International Conference on Field Programmable Logic and Applications (FPL).
[9] James H. Anderson,et al. Robust Real-Time Multiprocessor Interrupt Handling Motivated by GPUs , 2012, 2012 24th Euromicro Conference on Real-Time Systems.
[10] Kyoung-Don Kang,et al. Supporting Preemptive Task Executions and Memory Copies in GPGPUs , 2012, 2012 24th Euromicro Conference on Real-Time Systems.
[11] Wei Zhang,et al. Bounding Worst-Case DRAM Performance on Multicore Processors , 2013, J. Comput. Sci. Eng..
[12] Maurice Steinman,et al. AMD Fusion APU: Llano , 2012, IEEE Micro.
[13] Venkatesan Muthukumar,et al. Energy Aware Scheduling of Aperiodic Real-Time Tasks on Multiprocessor Systems , 2013, J. Comput. Sci. Eng..
[14] Wei Zhang,et al. Exploiting Standard Deviation of CPI to Evaluate Architectural Time-Predictability , 2014, J. Comput. Sci. Eng..
[15] Francisco Tirado,et al. Multi-GPU based on multicriteria optimization for motion estimation system , 2013, EURASIP Journal on Advances in Signal Processing.
[16] Wei Zhang,et al. Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches , 2012, J. Comput. Sci. Eng..
[17] Björn Andersson,et al. Makespan Computation for GPU Threads Running on a Single Streaming Multiprocessor , 2012, 2012 24th Euromicro Conference on Real-Time Systems.
[18] Lei Zhou,et al. DART-CUDA: A PGAS Runtime System for Multi-GPU Systems , 2015, 2015 14th International Symposium on Parallel and Distributed Computing.
[19] Peter M. Athanas,et al. Enabling development of OpenCL applications on FPGA platforms , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.
[20] John A. Clark,et al. Holistic schedulability analysis for distributed hard real-time systems , 1994, Microprocess. Microprogramming.
[21] Claus B. Madsen,et al. A scalable GPU-based approach to shading and shadowing for photorealistic real-time augmented reality , 2007, GRAPP.
[22] Michael González Harbour,et al. Schedulability analysis for tasks with static and dynamic offsets , 1998, Proceedings 19th IEEE Real-Time Systems Symposium (Cat. No.98CB36279).
[23] Wei Zhang,et al. Two-Level Scratchpad Memory Architectures to Achieve Time Predictability and High Performance , 2014, J. Comput. Sci. Eng..
[24] Sebastian Hack,et al. Improving Performance of OpenCL on CPUs , 2012, CC.
[25] Christophe Jaillet,et al. MultiGPU computing using MPI or OpenMP , 2010, Proceedings of the 2010 IEEE 6th International Conference on Intelligent Computer Communication and Processing.
[26] John D. Owens,et al. Multi-GPU MapReduce on GPU Clusters , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[27] Wang Yi,et al. New Response Time Bounds for Fixed Priority Multiprocessor Scheduling , 2009, 2009 30th IEEE Real-Time Systems Symposium.
[28] Michael González Harbour,et al. Offset-based response time analysis of distributed systems scheduled under EDF , 2003, 15th Euromicro Conference on Real-Time Systems, 2003. Proceedings..
[29] James H. Anderson,et al. GPUSync: A Framework for Real-Time GPU Management , 2013, 2013 IEEE 34th Real-Time Systems Symposium.
[30] Li Li,et al. Speculative Parallelism Characterization Profiling in General Purpose Computing Applications , 2015, J. Comput. Sci. Eng..
[31] R. Govindarajan,et al. Fluidic Kernels: Cooperative Execution of OpenCL Programs on Multiple Heterogeneous Devices , 2014, CGO '14.
[32] Darius Burschka,et al. Efficient occupancy grid computation on the GPU with lidar and radar for road boundary detection , 2010, 2010 IEEE Intelligent Vehicles Symposium.
[33] Pavan Nagendra. Performance characterization of automotive computer vision systems using Graphics Processing Units (GPUs) , 2011, 2011 International Conference on Image Information Processing.
[34] Keshab K. Parhi,et al. Semiblind frequency-domain timing synchronization and channel estimation for OFDM systems , 2013, EURASIP J. Adv. Signal Process..
[35] Jack J. Dongarra,et al. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming , 2012, Parallel Comput..
[36] Wei Zhang,et al. Multicore Real-Time Scheduling to Reduce Inter-Thread Cache Interferences , 2013, J. Comput. Sci. Eng..
[37] Eduardo Cabal-Yepez,et al. Early Experiences with OpenCL on FPGAs: Convolution Case Study , 2015, 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines.
[38] James H. Anderson,et al. Exploring the Multitude of Real-Time Multi-GPU Configurations , 2014, 2014 IEEE Real-Time Systems Symposium.
[39] Kang G. Shin,et al. Improvement of Real-Time Multi-Core Schedulability with Forced Non-Preemption , 2014, IEEE Transactions on Parallel and Distributed Systems.
[40] Jungwon Kim,et al. Achieving a single compute device image in OpenCL for multiple GPUs , 2011, PPoPP '11.
[41] Scott A. Mahlke,et al. Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[42] Shinpei Kato,et al. Gdev: First-Class GPU Resource Management in the Operating System , 2012, USENIX Annual Technical Conference.
[43] Shinpei Kato,et al. RGEM: A Responsive GPGPU Execution Model for Runtime Engines , 2011, 2011 IEEE 32nd Real-Time Systems Symposium.
[44] Jinkyu Lee,et al. Global EDF Schedulability Analysis for Synchronous Parallel Tasks on Multicore Platforms , 2013, 2013 25th Euromicro Conference on Real-Time Systems.
[45] Marko Bertogna,et al. Response-Time Analysis for Globally Scheduled Symmetric Multiprocessor Platforms , 2007, RTSS 2007.