A combined fast/cycle accurate simulation tool for reconfigurable accelerator evaluation: application to distributed data management
[1] Daniel M. Dreps, et al. IBM POWER9 opens up a new era of acceleration enablement: OpenCAPI, 2018, IBM J. Res. Dev.
[2] Somayeh Sardashti, et al. The gem5 simulator, 2011, CARN.
[3] Christian Steger, et al. A software performance simulation methodology for rapid system architecture exploration, 2008, 2008 15th IEEE International Conference on Electronics, Circuits and Systems.
[4] Mathieu Jan, et al. JuxMem: An Adaptive Supportive Platform for Data Sharing on the Grid, 2001, Scalable Comput. Pract. Exp.
[5] Kenneth B. Kent, et al. Simulation-Based Circuit-Activity Estimation for FPGAs Containing Hard Blocks, 2017, 2017 International Symposium on Rapid System Prototyping (RSP).
[6] Fred G. Gustavson, et al. Two Fast Algorithms for Sparse Matrices: Multiplication and Permuted Transposition, 1978, TOMS.
[7] Jason Cong, et al. PARADE: A cycle-accurate full-system simulation Platform for Accelerator-Rich Architectural Design and Exploration, 2015, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[8] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.
[9] Kevin E. Murray, et al. VTR 8: High Performance CAD and Customizable FPGA Architecture Modelling, 2020.
[10] Nicolas Ventroux, et al. Hybrid Prototyping Methodology for Rapid System Validation in HW/SW Co-Design, 2019, 2019 Conference on Design and Architectures for Signal and Image Processing (DASIP).
[11] Gu-Yeon Wei, et al. Co-designing accelerators and SoC interfaces using gem5-Aladdin, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[12] Lieven Eeckhout. Heterogeneity in Response to the Power Wall, 2015, IEEE Micro.
[13] Kai Li, et al. IVY: A Shared Virtual Memory System for Parallel Computing, 1988, ICPP.
[14] Wei Zhang, et al. PAAS: A system level simulator for heterogeneous computing architectures, 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[15] Jacob Nelson, et al. Latency-Tolerant Software Distributed Shared Memory, 2015, USENIX Annual Technical Conference.
[16] James C. Hoe, et al. FIST: A fast, lightweight, FPGA-friendly packet latency estimator for NoC modeling in full-system simulations, 2011, Proceedings of the Fifth ACM/IEEE International Symposium.
[17] Alan L. Cox, et al. TreadMarks: shared memory computing on networks of workstations, 1996.
[18] David R. Kaeli, et al. Multi2Sim: A simulation framework for CPU-GPU computing, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[19] Wei Zhang, et al. HeteroSim: A heterogeneous CPU-FPGA simulator, 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[20] James A. Ross, et al. Implementing OpenSHMEM for the Adapteva Epiphany RISC Array Processor, 2016, ICCS.
[21] Jason Cong, et al. High-Level Synthesis for FPGAs: From Prototyping to Deployment, 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[22] Loïc Cudennec. Software-Distributed Shared Memory over Heterogeneous Micro-server Architecture, 2017, Euro-Par Workshops.
[23] Stefanos Kaxiras, et al. Turning Centralized Coherence and Distributed Critical-Section Execution on their Head: A New Approach for Scalable Distributed Shared Memory, 2015, HPDC.
[24] Loïc Cudennec. Merging the Publish-Subscribe Pattern with the Shared Memory Paradigm, 2018, Euro-Par Workshops.