Backtracking Optimized DDG Directed Scheduling Algorithm for Clustered VLIW Architectures

This work presents an instruction scheduling approach that improves the performance of clustered VLIW architectures. The proposed scheme consists of two phases: a preliminary scheduling phase directed by analysis of the Data Dependence Graph (DDG), and a backtracking optimization phase that brings further improvement by simultaneously balancing the workload across clusters and minimizing the penalties of inter-cluster data communication. We have implemented the proposed scheme and evaluated it with the UTDSP benchmarks. The results show speedups of up to 38.58%, with average speedups ranging from 23.91% (2 clusters) to 26.78% (4 clusters).
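To make the trade-off between workload balance and inter-cluster communication concrete, the following is a minimal sketch of a greedy, DDG-driven cluster assignment. It is not the algorithm proposed in this work: it is a plain list-scheduling heuristic standing in for a preliminary phase, and it omits the backtracking optimization entirely. The function name `schedule`, the `copy_penalty` parameter, and the one-issue-slot-per-cluster machine model are all assumptions made for illustration.

```python
from collections import defaultdict


def schedule(ddg, latency, n_clusters, copy_penalty=1):
    """Greedy cluster assignment over a DDG (illustrative sketch only).

    ddg:      dict mapping each operation to the list of operations it depends on.
    latency:  dict mapping each operation to its latency in cycles.
    Assumes one issue slot per cluster per cycle (hypothetical machine model).
    Returns {op: (cluster, start_cycle)}.
    """
    placed = {}
    busy = defaultdict(set)  # cluster -> set of occupied issue cycles
    remaining = dict(ddg)
    while remaining:
        # Operations whose predecessors have all been placed.
        ready = sorted(op for op, preds in remaining.items()
                       if all(p in placed for p in preds))
        if not ready:
            raise ValueError("dependence graph contains a cycle")
        for op in ready:
            best = None
            for c in range(n_clusters):
                earliest = 0
                for p in ddg[op]:
                    pc, ps = placed[p]
                    # A value produced on another cluster pays the copy penalty.
                    done = ps + latency[p] + (copy_penalty if pc != c else 0)
                    earliest = max(earliest, done)
                # Skip cycles in which this cluster's slot is already taken.
                while earliest in busy[c]:
                    earliest += 1
                if best is None or earliest < best[1]:
                    best = (c, earliest)
            placed[op] = best
            busy[best[0]].add(best[1])
            del remaining[op]
    return placed


if __name__ == "__main__":
    ddg = {"a": [], "b": [], "c": ["a", "b"]}
    latency = {"a": 1, "b": 1, "c": 1}
    print(schedule(ddg, latency, n_clusters=2))
```

In this toy model, spreading independent operations across clusters shortens their issue time, but every consumer placed away from its producer pays `copy_penalty`. A backtracking phase of the kind described above would revisit such greedy placements when they turn out to lengthen the schedule.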
