Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics
Hyesoon Kim | Fares Elsabbagh | Blaise Tine | Krishna Praveen Yalamarthy
[1] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[2] Mike Mantor,et al. AMD Radeon™ HD 7970 with graphics core next (GCN) architecture , 2012, 2012 IEEE Hot Chips 24 Symposium (HCS).
[3] Bringing OpenCL to Commodity RISC-V CPUs , 2021 .
[4] Onur Mutlu,et al. Improving GPU performance via large warps and two-level warp scheduling , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Yunsup Lee,et al. A 45nm 1.3GHz 16.7 double-precision GFLOPS/W RISC-V processor with vector accelerators , 2014, ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC).
[6] Hyesoon Kim,et al. Tango: An Optimizing Compiler for Just-In-Time RTL Simulation , 2020, 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[7] Luca Benini,et al. Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOI , 2019, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[8] Lars Bishop. OpenGL ES 1.1, 2.0 and EGL , 2006, SIGGRAPH Courses.
[9] Aaftab Munshi,et al. The OpenCL specification , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).
[10] Luca Benini,et al. A multi-banked shared-l1 cache architecture for tightly coupled processor clusters , 2012, 2012 International Symposium on System on Chip (SoC).
[11] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[12] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[13] Sudhakar Yalamanchili,et al. Lightweight SIMT core designs for intelligent 3D stacked DRAM , 2017, MEMSYS.
[14] John Burgess,et al. RTX ON – The NVIDIA TURING GPU , 2019, 2019 IEEE Hot Chips 31 Symposium (HCS).
[15] Samuli Laine,et al. High-performance software rasterization on GPUs , 2011, HPG '11.
[16] Martin White,et al. MIP-Map Level Selection for Texture Mapping , 1998, IEEE Trans. Vis. Comput. Graph.
[17] Jie Cheng,et al. CUDA by Example: An Introduction to General-Purpose GPU Programming , 2010, Scalable Comput. Pract. Exp.
[18] John Wawrzynek,et al. Chisel: Constructing hardware in a Scala embedded language , 2012, DAC Design Automation Conference 2012.
[19] Jose Renau,et al. Fluid Pipelines: Elastic Circuitry without Throughput Penalty , 2016 .
[20] Ian Bratt,et al. The ARM® Mali-T880 Mobile GPU , 2015, 2015 IEEE Hot Chips 27 Symposium (HCS).
[21] Carlos González,et al. ATTILA: a cycle-level execution-driven simulator for modern GPU architectures , 2006, 2006 IEEE International Symposium on Performance Analysis of Systems and Software.
[22] Paolo Ienne,et al. Stop Crying Over Your Cache Miss Rate: Handling Efficiently Thousands of Outstanding Misses in FPGAs , 2019, FPGA.
[23] Russell Tessier,et al. FlexGrip: A soft GPGPU for FPGAs , 2013, 2013 International Conference on Field-Programmable Technology (FPT).
[24] Valerio Pascucci,et al. RTX beyond ray tracing: exploring the use of hardware ray tracing cores for tet-mesh point location , 2019, High Performance Graphics.
[25] Paolo Ienne,et al. Elastic CGRAs , 2013, FPGA '13.
[26] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[27] David A. Wood,et al. gem5-gpu: A Heterogeneous CPU-GPU Simulator , 2015, IEEE Computer Architecture Letters.
[28] Homan Igehy,et al. Prefetching in a texture cache architecture , 1998, Workshop on Graphics Hardware.
[29] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[30] Arvind. Bluespec: A language for hardware design, simulation, synthesis and verification Invited Talk , 2003, MEMOCODE.
[31] Sylvain Collange,et al. Simty: generalized SIMT execution on RISC-V , 2017 .
[32] Hoi-Jun Yoo,et al. Mobile 3D Graphics SoC: From Algorithm to Chip , 2010 .
[33] Jason Helge Anderson,et al. Impact of Cache Architecture and Interface on Performance and Area of FPGA-Based Processor/Parallel-Accelerator Systems , 2012, 2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines.
[34] Marc Stamminger,et al. CPU-style SIMD ray traversal on GPUs , 2018, High Performance Graphics.
[35] Dieter Schmalstieg,et al. On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing , 2018, PACMCGIT.
[36] Fares Elsabbagh,et al. Vortex: OpenCL Compatible RISC-V GPGPU , 2020, ArXiv.
[37] Tor M. Aamodt,et al. Emerald: Graphics Modeling for SoC Systems , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[38] Tor M. Aamodt,et al. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[39] Ben Sander. HSAIL: Portable compiler IR for HSA , 2013, 2013 IEEE Hot Chips 25 Symposium (HCS).
[40] Aaron Carpenter,et al. Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads , 2015, 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[41] Timothy N. Miller,et al. NyuziRaster: Optimizing rasterizer performance and energy in the Nyuzi open source GPU , 2016, 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[42] Michael Hübner,et al. FGPU: An SIMT-Architecture for FPGAs , 2016, FPGA.
[43] Wu-chun Feng,et al. The Green500 List , 2007 .
[44] Erik Brunvand,et al. Mach-RT: a many chip architecture for ray tracing , 2019, High Performance Graphics.
[45] J. Gregory Steffan,et al. Efficient multi-ported memories for FPGAs , 2010, FPGA '10.
[46] Karthikeyan Sankaralingam,et al. MIAOW - An open source RTL implementation of a GPGPU , 2015, 2015 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS XVIII).
[47] Matthew Poremba,et al. Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[48] David R. Kaeli,et al. Multi2Sim: A simulation framework for CPU-GPU computing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[49] Peter Bøgh Andersen. Elastic Systems , 2001, INTERACT.