Memory Controller for Vector Processor

To manage power and memory wall effects, the HPC industry supports FPGA reconfigurable accelerators and vector processing cores for data-intensive scientific applications. FPGA-based vector accelerators are used to increase the performance of high-performance application kernels. Adding more vector lanes does not improve performance when the processor/memory performance gap dominates. In addition, performance degrades when on/off-chip communication time becomes more critical than computation time. Irregular data arrangements and complex scheduling schemes in applications introduce further delays. Therefore, just like generic scalar processors, all classes of vector machines, from vector supercomputers to vector microprocessors, require data management and access units that improve on/off-chip bandwidth and hide main memory latency. In this work, we propose an Advanced Programmable Vector Memory Controller (PVMC), which boosts noncontiguous vector data accesses by integrating descriptors of memory patterns, a specialized on-chip memory, a memory manager in hardware, and multiple DRAM controllers. We implemented and validated the proposed system on an Altera DE4 FPGA board. The PVMC is also integrated with an ARM Cortex-A9 processor on the Xilinx Zynq All-Programmable System on Chip architecture. We compare the PVMC system against vector and scalar processor systems without PVMC. Compared with a baseline vector system, the results show that the PVMC system transfers data sets 1.40x to 2.12x faster, achieves speedups between 2.01x and 4.53x for 10 applications, and consumes 2.56x to 4.04x less energy.
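To make the idea of a memory-pattern descriptor concrete, the following is a minimal software sketch of how one strided descriptor could stand in for many individual noncontiguous accesses. It is an illustration under stated assumptions, not the PVMC descriptor format: the names `StridedDescriptor`, `addresses`, and `gather`, and the fields `base`, `stride`, `count`, and `elem_size`, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class StridedDescriptor:
    """Hypothetical 1D strided access pattern (not the actual PVMC format)."""
    base: int           # byte address of the first element
    stride: int         # distance in bytes between consecutive elements
    count: int          # number of elements to gather
    elem_size: int = 4  # element width in bytes

    def addresses(self) -> Iterator[int]:
        # Expand the compact descriptor into the byte addresses it covers.
        for i in range(self.count):
            yield self.base + i * self.stride


def gather(memory: bytes, desc: StridedDescriptor) -> List[bytes]:
    """Collect noncontiguous elements into a dense list,
    loosely analogous to filling an on-chip buffer."""
    return [memory[a:a + desc.elem_size] for a in desc.addresses()]


# Example: read column 1 of a 4x4 row-major matrix of 4-byte words.
memory = b"".join(i.to_bytes(4, "little") for i in range(16))
column = StridedDescriptor(base=4, stride=16, count=4)
values = [int.from_bytes(w, "little") for w in gather(memory, column)]
print(values)  # [1, 5, 9, 13]
```

In this sketch a single descriptor replaces four separate address computations; a hardware memory manager could, in principle, perform the same expansion without processor involvement.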
