High-performance implementation for graph-based packet classification algorithm on network processor

In this paper, we present a high-performance implementation of the search function of a graph-based packet classification algorithm, as used by networking applications in Internet routers, on the Intel IXP1200 network processor. The implementation consolidates memory reads to reduce the number of expensive SRAM accesses. It also places instructions after SRAM accesses so that useful work overlaps the memory access latency, which improves processor utilization. Experimental results show that the search function, running on five IXP1200 microengines at 166 MHz, achieves up to 1.18 Msps (million searches per second), which meets the packet-rate requirements of links ranging from OC-3 or Fast Ethernet up to OC-12 or Gigabit Ethernet. The methods presented here can also be adapted to other network processors with similar architectures.
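The idea of consolidating memory reads can be illustrated with a minimal sketch in C. It is not the implementation described in this paper: the simulated memory `sim_sram`, the three-word node layout, and the functions `sram_read_burst`, `step_naive`, and `step_consolidated` are all hypothetical names introduced for illustration. The sketch assumes only that one read command can return several consecutive words, and it counts read commands rather than modeling real SRAM timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRAM_WORDS 64

/* Simulated SRAM (hypothetical): counts read commands, not cycles. */
typedef struct {
    uint32_t words[SRAM_WORDS];
    unsigned accesses; /* number of read commands issued so far */
} sim_sram;

/* One read command that returns `count` consecutive 32-bit words. */
void sram_read_burst(sim_sram *m, size_t addr, uint32_t *dst, size_t count)
{
    assert(addr + count <= SRAM_WORDS);
    for (size_t i = 0; i < count; i++)
        dst[i] = m->words[addr + i];
    m->accesses++;
}

/* Assumed node layout (illustrative only): three consecutive words. */
enum { NODE_KEY = 0, NODE_MATCH = 1, NODE_MISS = 2, NODE_WORDS = 3 };

/* Unconsolidated step: read the key, then read the chosen successor.
 * This issues two dependent read commands per node visited. */
uint32_t step_naive(sim_sram *m, size_t node, uint32_t field)
{
    uint32_t key, next;
    sram_read_burst(m, node + NODE_KEY, &key, 1);
    sram_read_burst(m, node + (field == key ? NODE_MATCH : NODE_MISS),
                    &next, 1);
    return next;
}

/* Consolidated step: fetch the whole node with one read command and
 * select the successor locally, at the cost of one unused word. */
uint32_t step_consolidated(sim_sram *m, size_t node, uint32_t field)
{
    uint32_t w[NODE_WORDS];
    sram_read_burst(m, node, w, NODE_WORDS);
    return field == w[NODE_KEY] ? w[NODE_MATCH] : w[NODE_MISS];
}
```

Under these assumptions, both functions return the same successor, but the consolidated version issues one read command per node instead of two. The trade-off is that it transfers a word it may not need, which is worthwhile only when the per-access cost dominates the per-word cost.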