A 64-way VLIW/SIMD FPGA architecture and design flow

Current FPGA architectures are heterogeneous, containing tens of thousands of logic elements and hundreds of embedded multipliers and memory units. However, efficiently utilizing these resources requires hardware designers and complex computer aided design tools. The paper describes several multi-processor architectures implemented on an FPGA, including a 64-way single interface multiple data (SIMD) and a variable size very long instruction word (VLIW) architecture. The design and synthesis of the target architectures are presented and compared for scalability and achieving parallelism. The performance and chip utilization of a shared register file is examined for different numbers of VLIW processing elements. The associated compilation flow is described based on the Trimaran VLIW compiler which achieves explicitly parallel instructions from C code. Benchmarks from the Media-Bench suite are being used to test the performance of the parallelism of both the software and hardware components.