An Automated Temporal Partitioning Tool for a Class of DSP Applications

We present an automated temporal partitioning tool for developing dynamically reconfigurable designs starting from behavior-level specifications for a class of DSP applications. An Integer Linear Programming (ILP) model is formulated to achieve near-optimal latency designs. We also present a loop restructuring method to achieve maximum throughput for a class of DSP applications. This restructuring transformation is performed on the temporally partitioned behavior and results in near-optimal throughput. A case study on the Joint Photographic Experts Group (JPEG) image compression algorithm demonstrates the effectiveness of our approach.

1 Introduction

FPGAs have been used successfully in the rapid prototyping of designs [1, 2]. The long fabrication times associated with ASIC design are eliminated. But the device capacity of FPGAs is far less than that of ASIC chips. Therefore, when synthesizing large designs on FPGAs, multi-FPGA boards are usually used to increase device capacity. This necessitates spatial partitioning of the application. In this style of static FPGA design, the FPGA is configured once at the start of the application, and the same configuration continues until the execution ends.

However, the reconfiguration capability of SRAM FPGAs can be utilized to fit a large application onto the FPGA by partitioning the application over time into multiple segments. The division of an application into temporal segments that are configured one after the other on the FPGA is called temporal partitioning. The first temporal partition receives input data, performs computations, and stores the intermediate result in the on-board memory. The device is then reconfigured for the next segment, which computes results based on the intermediate data from the previous partition. Such temporally partitioned designs are called Run-Time Reconfigured (RTR) systems.

Results of manual temporal partitioning of an application were presented recently in [4].
In that work, an image interpolation process was implemented by temporally partitioning it into stages. The temporally partitioned design was shown to be comparable to other commercial systems, which demonstrates the feasibility of this approach. However, the division of the application into temporal partitions was done by hand. Therefore, techniques to automatically partition designs temporally are needed. In this paper, we present a technique to temporally partition designs such that the reconfiguration overhead of the RTR design is minimized.
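
The RTR execution model described above can be illustrated with a minimal Python sketch. It is not the tool presented in this paper. The names `Segment`, `run_rtr`, `exec_time`, and `reconfig_time` are hypothetical, and the sketch assumes that every segment, including the first, pays one fixed configuration cost and that total latency is the sum of configuration and execution times. The on-board memory is modeled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Memory = Dict[str, float]  # stand-in for the on-board memory


@dataclass
class Segment:
    """One temporal partition (hypothetical model, not the paper's tool)."""
    name: str
    compute: Callable[[Memory], Memory]  # reads memory, returns values to store
    exec_time: float                     # assumed execution time of the segment


def run_rtr(segments: List[Segment], inputs: Memory,
            reconfig_time: float) -> Tuple[Memory, float]:
    """Configure and run the segments one after the other.

    Assumption: each segment costs `reconfig_time` to load onto the device,
    so the overhead grows with the number of temporal partitions.
    """
    memory = dict(inputs)
    latency = 0.0
    for seg in segments:
        latency += reconfig_time            # load this segment's configuration
        memory.update(seg.compute(memory))  # store intermediate results
        latency += seg.exec_time
    return memory, latency


# Two segments: the second consumes the intermediate value "s"
# that the first one left in memory.
segments = [
    Segment("add", lambda m: {"s": m["a"] + m["b"]}, exec_time=1.0),
    Segment("mul", lambda m: {"y": m["s"] * m["c"]}, exec_time=2.0),
]
memory, latency = run_rtr(segments, {"a": 1.0, "b": 2.0, "c": 3.0},
                          reconfig_time=10.0)
```

Under these assumptions the configuration cost dominates the total latency, which is why a partitioning that needs fewer reconfigurations is attractive.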