An Optimal Microarchitecture for Stencil Computation Acceleration Based on Nonuniform Partitioning of Data Reuse Buffers
[1] Jason Cong,et al. High-Level Synthesis for FPGAs: From Prototyping to Deployment , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[2] Jason Cong,et al. Memory partitioning and scheduling co-optimization in behavioral synthesis , 2012, 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[3] Todor Stefanov,et al. pn: A Tool for Improved Derivation of Process Networks , 2007, EURASIP J. Embed. Syst..
[4] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. I. One-dimensional time , 1992, International Journal of Parallel Programming.
[5] Jason Cong,et al. Polyhedral-based data reuse optimization for configurable computing , 2013, FPGA '13.
[6] Marc Snir,et al. Getting Up to Speed: The Future of Supercomputing , 2004 .
[7] David Atienza,et al. A high-level synthesis flow for the implementation of iterative stencil loop algorithms on FPGA devices , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[8] Samuel H. Fuller,et al. The Future of Computing Performance: Game Over or Next Level? , 2014 .
[9] Jason Cong,et al. Optimizing memory hierarchy allocation with loop transformations for high-level synthesis , 2012, DAC Design Automation Conference 2012.
[10] Jason Cong,et al. An Optimal Microarchitecture for Stencil Computation Acceleration Based on Nonuniform Partitioning of Data Reuse Buffers , 2014, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[11] Jason Cong,et al. CMOST: A system-level FPGA compilation framework , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[12] Jason Cong,et al. Automatic memory partitioning and scheduling for throughput and power optimization , 2009, 2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers.
[13] Yosi Ben-Asher,et al. Automatic Memory Partitioning: Increasing memory parallelism via data structure partitioning , 2010, 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
[14] Jason Cong,et al. Customizable Domain-Specific Computing , 2009, IEEE Design & Test of Computers.
[15] Paul Feautrier. Some efficient solutions to the affine scheduling problem , 1992 .
[16] Jason Cong,et al. Accelerator-rich CMPs: From concept to real hardware , 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[17] Jason Cong,et al. Memory partitioning for multidimensional arrays in high-level synthesis , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[18] J. Ramanujam,et al. Compile-Time Techniques for Data Distribution in Distributed Memory Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[19] Mark Horowitz,et al. 1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[20] Jason Cong,et al. An integrated and automated memory optimization flow for FPGA behavioral synthesis , 2012, 17th Asia and South Pacific Design Automation Conference.
[21] Jason Cong,et al. Improving high level synthesis optimization opportunity through polyhedral transformations , 2013, FPGA '13.