A multi-FPGA based platform for emulating a 100m-transistor-scale processor with high-speed peripherals (abstract only)

This paper describes a multi-FPGA platform for emulating the Loongson-2G microprocessor on different motherboards. The platform targets verification and evaluation of the Loongson-2G, the next generation of the Loongson-2 family, which is composed of one four-issue, out-of-order, 64-bit MIPS-compatible processor core named GS464, a 1 MB secondary cache, a HyperTransport I/O interface, a DDR2/3 memory interface, and several low-speed I/O interfaces. Most of the microprocessor is mapped onto the platform, which consists of two Virtex-5 330 FPGA chips. Semi-custom partitioning tactics are developed within the design flow to synthesize the whole design onto the two FPGAs, and architectural-level modifications are applied to the original chip architecture to make it easier to partition into two parts. The high-speed SerDes of the HyperTransport I/O link and the DDR2/3 memory interface are emulated using several clocks with different phases. To address the difficulty of debugging in an FPGA system, a software-probe method assisted by hardware modules injected into the FPGA is developed and used to debug problems caused by behavioral mismatches between the ASIC RAM blocks and the FPGA RAM blocks. Some performance evaluation of the Loongson-2G is carried out on the platform as a pre-silicon test. To the authors' knowledge, no previous work has emulated a design of this size for verification and evaluation.
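
The abstract does not say how the phase-shifted clocks are arranged, so the following is only a toy timing model of the general idea, not the authors' circuit: several clocks of the same period, evenly offset in phase, together provide sampling edges at a multiple of the base rate, which lets slow logic capture a faster serial stream one bit per phase. The function names, the even phase spacing, and the one-bit-per-phase grouping are all assumptions made for illustration.

```python
from fractions import Fraction


def phase_edges(period, n_phases, n_cycles):
    """Return rising-edge times for n_phases clocks that share one period.

    Assumption: phases are evenly spaced, so clock k is delayed by
    k / n_phases of a period. Exact fractions avoid rounding noise.
    """
    edges = []
    for cycle in range(n_cycles):
        for k in range(n_phases):
            edges.append(cycle * period + Fraction(k, n_phases) * period)
    return edges


def deserialize(bits, n_phases):
    """Group a serial bit stream into parallel words, one bit per clock phase.

    Trailing bits that do not fill a whole word are dropped.
    """
    usable = len(bits) - len(bits) % n_phases
    return [bits[i:i + n_phases] for i in range(0, usable, n_phases)]


if __name__ == "__main__":
    # Four phases of a period-4 clock give one sampling edge per time unit.
    print(phase_edges(4, 4, 2))
    # Each parallel word holds the bits captured by phases 0..3 in one cycle.
    print(deserialize([1, 0, 1, 1, 0, 0, 1, 0, 1], 4))
```

In this model, four clocks with a period of 4 time units yield an edge every time unit, so the parallel side runs at a quarter of the serial bit rate. A real emulation would also have to handle clock-domain crossing and skew, which this sketch ignores.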
