Providing Performance Guarantees in Multipass Network Processors

Current network processors (NPs) increasingly deal with packets with heterogeneous processing times. In such an environment, packets that require many processing cycles delay low-latency traffic because the common approach in today's NPs is to employ run-to-completion processing. These difficulties have led to the emergence of the Multipass NP architecture, where after a processing cycle ends, all processed packets are recycled into the buffer and recompete for processing resources. In this paper, we provide a model that captures many of the characteristics of this architecture, and we consider several scheduling and buffer management algorithms that are specially designed to optimize the performance of multipass network processors. In particular, we provide analytical guarantees for the throughput performance of our algorithms. We further conduct a comprehensive simulation study, which validates our results.
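The recycling behavior described above can be illustrated with a minimal sketch. It is not the model or any algorithm from the paper. It assumes a single processing core, a finite buffer with tail-drop admission, and packets described only by the number of passes they still need. The function name `simulate`, the policy labels `"fifo"` and `"fewest_passes"`, and the input format are all hypothetical choices made for illustration.

```python
from collections import deque


def simulate(arrivals, buffer_size, policy="fifo"):
    """Count packets fully processed by a toy single-core multipass NP.

    arrivals[t] lists the required number of passes of each packet
    arriving in time slot t (illustrative input format, an assumption).
    """
    buffer = deque()          # each entry is a packet's remaining passes
    transmitted = 0
    t = 0
    while t < len(arrivals) or buffer:
        # Admission: tail-drop when the buffer is full (an assumption).
        if t < len(arrivals):
            for passes in arrivals[t]:
                if len(buffer) < buffer_size:
                    buffer.append(passes)
        # Scheduling: pick one packet and give it one processing cycle.
        if buffer:
            if policy == "fewest_passes":
                idx = min(range(len(buffer)), key=lambda i: buffer[i])
                remaining = buffer[idx]
                del buffer[idx]
            else:  # "fifo": serve the packet at the head of the buffer
                remaining = buffer.popleft()
            remaining -= 1
            if remaining == 0:
                transmitted += 1          # packet leaves the NP
            else:
                buffer.append(remaining)  # recycle: recompete next slot
        t += 1
    return transmitted


if __name__ == "__main__":
    trace = [[3, 1], [1], [1]]
    for policy in ("fifo", "fewest_passes"):
        print(policy, simulate(trace, buffer_size=2, policy=policy))
```

On this small trace with a buffer of two packets, serving the packet with the fewest remaining passes frees buffer space sooner than FIFO, so fewer arrivals are dropped. This only illustrates why the choice of scheduling and buffer management policy affects throughput; it makes no claim about the guarantees proved in the paper.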

[1] Michael Segal et al. Providing performance guarantees in multipass network processors. 2011 Proceedings IEEE INFOCOM, 2011.
