[1] Yu Wang,et al. ForeGraph: Exploring Large-scale Graph Processing on Multi-FPGA Architecture , 2017, FPGA.
[2] Jason Cong,et al. Overcoming Data Transfer Bottlenecks in FPGA-based DNN Accelerators via Layer Conscious Memory Management , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[3] Willy Zwaenepoel,et al. X-Stream: edge-centric graph processing using streaming partitions , 2013, SOSP.
[4] Duncan H. Lawrie,et al. Access and Alignment of Data in an Array Processor , 1975, IEEE Transactions on Computers.
[5] Jason Cong,et al. FLASH: Fast, Parallel, and Accurate Simulator for HLS , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[6] Tanner Young-Schultz,et al. Using OpenCL to Enable Software-like Development of an FPGA-Accelerated Biophotonic Cancer Treatment Simulator , 2020, FPGA.
[7] Jason Cong,et al. High-Level Synthesis for FPGAs: From Prototyping to Deployment , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[8] Jason Cong,et al. PolySA: Polyhedral-Based Systolic Array Auto-Compilation , 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[9] John Wawrzynek,et al. AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning , 2020, MLSys.
[10] Hyuk-Jae Lee,et al. Generalized Cannon's algorithm for parallel matrix multiplication , 1997, ICS '97.
[11] Norbert Wehn,et al. When Massive GPU Parallelism Ain't Enough: A Novel Hardware Architecture of 2D-LSTM Neural Network , 2020, FPGA.
[12] Jason Cong,et al. Rapid Cycle-Accurate Simulator for High-Level Synthesis , 2019, FPGA.
[13] Jason Cong,et al. SODA: Stencil with Optimized Dataflow Architecture , 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[14] Satoshi Matsuoka,et al. Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL , 2018, FPGA.
[15] Lise Getoor,et al. Collective Classification in Network Data , 2008 .
[16] Onur Mutlu,et al. Boyi: A Systematic Framework for Automatically Deciding the Right Execution Model of OpenCL Applications on FPGAs , 2020, FPGA.
[17] John Wawrzynek,et al. Chisel: Constructing hardware in a Scala embedded language , 2012, DAC Design Automation Conference 2012.
[18] Jiuxi Meng,et al. High-Performance FPGA Network Switch Architecture , 2020, FPGA.
[19] Viktor K. Prasanna,et al. HitGraph: High-throughput Graph Processing Framework on FPGA , 2019, IEEE Transactions on Parallel and Distributed Systems.
[20] Peng Zhang. Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[21] Gilles Kahn,et al. The Semantics of a Simple Language for Parallel Programming , 1974, IFIP Congress.
[22] Xuan Yang,et al. Programming Heterogeneous Systems from an Image Processing DSL , 2016, ACM Trans. Archit. Code Optim.
[23] Jason Cong,et al. Exploiting Computation Reuse for Stencil Accelerators , 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[24] Jing Li,et al. Accelerating Graph Analytics by Co-Optimizing Storage and Access on an FPGA-HMC Platform , 2018, FPGA.
[25] Steven J. E. Wilton,et al. Fast Turnaround HLS Debugging Using Dependency Analysis and Debug Overlays , 2020, ACM Trans. Reconfigurable Technol. Syst.
[26] James L. Peterson,et al. Petri Nets , 1977, CSUR.
[27] Jason Cong,et al. HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration , 2020, FPGA.
[28] C. A. R. Hoare,et al. Communicating sequential processes , 1978, CACM.
[29] Yu Wang,et al. FPGP: Graph Processing Framework on FPGA A Case Study of Breadth-First Search , 2016, FPGA.
[30] Yu Ting Chen,et al. EASY: Efficient Arbiter SYnthesis from Multi-threaded Code , 2019, FPGA.
[31] Jason Helge Anderson,et al. LegUp: high-level synthesis for FPGA-based processor/accelerator systems , 2011, FPGA '11.
[32] Zhiru Zhang,et al. GraphZoom: A multi-level spectral approach for accurate and scalable graph embedding , 2020, ICLR.
[33] Rajeev Motwani,et al. The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.
[34] Viktor Prasanna,et al. GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms , 2019, FPGA.
[35] Jure Leskovec,et al. Learning to Discover Social Circles in Ego Networks , 2012, NIPS.
[36] Roberto Ierusalimschy,et al. Revisiting coroutines , 2009, TOPL.
[37] Jason Cong,et al. Latte: Locality Aware Transformation for High-Level Synthesis , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[38] E.A. Lee,et al. Synchronous data flow , 1987, Proceedings of the IEEE.
[39] Paolo Ienne,et al. Combining Dynamic & Static Scheduling in High-level Synthesis , 2020, FPGA.
[40] Soojung Ryu,et al. SimParallel: A high performance parallel SystemC simulator using hierarchical multi-threading , 2014, 2014 IEEE International Symposium on Circuits and Systems (ISCAS).
[41] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[42] Torsten Hoefler,et al. Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis , 2019, FPGA.
[43] Jason Cong,et al. An efficient and versatile scheduling algorithm based on SDC formulation , 2006, 2006 43rd ACM/IEEE Design Automation Conference.
[44] Jason Cong,et al. Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency , 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[45] Tim Schmidt,et al. Exploiting thread and data level parallelism for ultimate parallel SystemC simulation , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[46] Pat Hanrahan,et al. Fleet: A Framework for Massively Parallel Streaming on FPGAs , 2020, ASPLOS.
[47] Michael Ferdman,et al. FPGA-Accelerated Samplesort for Large Data Sets , 2020, FPGA.
[48] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[49] Yu Wang,et al. NXgraph: An efficient graph processing system on a single machine , 2015, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).
[50] Jing Li,et al. Degree-aware Hybrid Graph Traversal on FPGA-HMC Platform , 2018, FPGA.
[51] Jason Cong,et al. End-to-End Optimization of Deep Learning Applications , 2020, FPGA.
[52] Jin Hee Kim,et al. High-Level Synthesis Techniques to Generate Deeply Pipelined Circuits for FPGAs with Registered Routing , 2019, 2019 International Conference on Field-Programmable Technology (ICFPT).
[53] Jure Leskovec,et al. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters , 2008, Internet Math.
[54] Melvin E. Conway,et al. Design of a separable transition-diagram compiler , 1963, CACM.
[55] Jason Cong,et al. ST-Accel: A High-Level Programming Platform for Streaming Applications on FPGA , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[56] Magnus Jahre,et al. DCMI , 2019, ACM Trans. Archit. Code Optim..
[57] James C. Hoe,et al. GraphGen: An FPGA Framework for Vertex-Centric Graph Computation , 2014, 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines.
[58] James C. Hoe,et al. Processor Assisted Worklist Scheduling for FPGA Accelerated Graph Processing on a Shared-Memory Platform , 2019, 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).