A Tree Based Router Search Engine Architecture with Single Port Memories

Pipelined forwarding engines are used in core routers to meet speed demands. Tree-based searches are pipelined across a number of stages to achieve high throughput, but this leaves memory unevenly distributed across the stages. To address this imbalance, conventional approaches either use complex dynamic memory allocation schemes or over-provision each pipeline stage. This paper describes the microarchitecture of a novel network search processor that provides both high execution throughput and balanced memory distribution by dividing the tree into subtrees and allocating each subtree separately, allowing searches to begin at any pipeline stage. The architecture is validated by implementing and simulating state-of-the-art solutions for IPv4 lookup, VPN forwarding, and packet classification. The new pipeline scheme and memory allocator achieve a memory allocation efficiency within 1% of that of non-pipelined schemes.
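
The effect of letting each subtree start at a different stage can be illustrated with a small sketch. It is not the allocator described in the paper: it assumes the stages form a ring, so that a search entering at stage s visits stages s, s+1, ... modulo the stage count, and it uses a simple greedy placement. The function names and the per-level node counts are hypothetical.

```python
from typing import List

# Each subtree is given as a list of node counts per level (root level first).
Subtree = List[int]


def conventional_stage_load(subtrees: List[Subtree], num_stages: int) -> List[int]:
    """Conventional pipeline: level d of every subtree is stored in stage d."""
    load = [0] * num_stages
    for levels in subtrees:
        for depth, nodes in enumerate(levels):
            load[depth] += nodes
    return load


def balanced_stage_load(subtrees: List[Subtree], num_stages: int) -> List[int]:
    """Assumed ring pipeline: each subtree may start at any stage and wraps around.

    Greedy heuristic (an illustrative assumption): place the largest subtrees
    first and pick the starting stage that minimises the resulting peak load.
    """
    load = [0] * num_stages
    for levels in sorted(subtrees, key=sum, reverse=True):
        if len(levels) > num_stages:
            raise ValueError("subtree is deeper than the pipeline")

        def peak_if_started_at(start: int) -> int:
            # Largest per-stage load among the stages this subtree would touch.
            return max(
                load[(start + depth) % num_stages] + nodes
                for depth, nodes in enumerate(levels)
            )

        start = min(range(num_stages), key=peak_if_started_at)
        for depth, nodes in enumerate(levels):
            load[(start + depth) % num_stages] += nodes
    return load


if __name__ == "__main__":
    # Four identical subtrees whose levels double in size (hypothetical data).
    trees = [[1, 2, 4, 8] for _ in range(4)]
    print("conventional:", conventional_stage_load(trees, 4))
    print("balanced:    ", balanced_stage_load(trees, 4))
```

In the conventional mapping, the deepest level of every subtree falls in the last stage, so that stage must be sized for the sum of all leaf levels. With rotated starting stages, the same total memory is spread across the stages, which is the kind of balance the abstract refers to.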
