An Interprocedural Code Optimization Technique for Network Processors Using Hardware Multi-Threading Support

Sophisticated C compiler support for network processors (NPUs) is required to improve their usability and, consequently, their acceptance in system design. Nonetheless, high-level code compilation always introduces overhead in code size and performance compared to handwritten assembly code. This overhead results partly from high-level function calls, which usually introduce memory accesses to save and reload register contents. A key feature of many NPU architectures is hardware multithreading support, in the form of separate register files, for fast context switching between different application tasks. This paper presents a new NPU code optimization technique that uses such hardware contexts to minimize the overhead of saving and reloading register contents via the runtime stack on function calls. The feasibility and the performance gain of this technique are demonstrated for the Infineon Technologies PP32 NPU architecture and typical network application kernels.
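
To make the source of the overhead concrete, the following is a toy cost model and not the optimization algorithm of the paper. It assumes that a call normally costs one store and one load per live register, and that a call assigned to a spare hardware context costs no memory accesses at all because it switches register files instead. The function names, the parameters, and the numbers in the usage lines are hypothetical. The model ignores the cost of the context switch itself and of passing arguments between register files.

```python
def stack_accesses(call_depth: int, live_regs: int) -> int:
    """Memory accesses when every call saves and reloads registers via the stack.

    Assumption: each call stores and later reloads `live_regs` registers,
    i.e. two memory accesses per register per call.
    """
    return call_depth * live_regs * 2


def context_accesses(call_depth: int, live_regs: int, spare_contexts: int) -> int:
    """Memory accesses when spare hardware contexts absorb some of the calls.

    Assumption: each spare context serves exactly one nested call by
    switching register files. Calls beyond the number of spare contexts
    fall back to the stack.
    """
    calls_on_stack = max(0, call_depth - spare_contexts)
    return stack_accesses(calls_on_stack, live_regs)


if __name__ == "__main__":
    # Hypothetical numbers: 5 nested calls, 8 live registers, 3 spare contexts.
    print(stack_accesses(5, 8))        # stack only
    print(context_accesses(5, 8, 3))   # 3 calls served by spare contexts
```

Under these assumptions, the saving grows with the number of calls that can be mapped to a spare context, which is why an interprocedural view of the call graph matters for deciding which calls should use one.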
