Exploration of energy efficient acceleration concepts for the ROHCv2 in LTE handsets

In this paper, we present different acceleration concepts for the Robust Header Compression version 2 (ROHCv2) algorithms in Long Term Evolution (LTE) handsets. First, we explore the potential performance improvements and energy savings from adopting scratchpad memories of various sizes. Second, dedicated hardware accelerators with different data transfer modes are compared in terms of processing speed and energy efficiency at the system level. By applying a virtual prototyping methodology with a proprietary filter module, we are able to investigate these two approaches within a state-of-the-art ARM-based mobile phone platform under real software loads. Additionally, the execution time is measured together with an estimate of the energy consumed in the memory and the bus architecture. With reasonably dimensioned scratchpad memories (16 kB each for instructions and data), speedups and energy savings of up to approximately 60 % each are achieved, depending on the cache sizes of the embedded processor. Even better performance, especially in combination with big caches, is reached with a dedicated ROHCv2 hardware accelerator that supports processing several packets at once in a so-called list mode. Compared to the pure software case, execution time and energy consumption both improve by up to 80 % with small caches; with big caches, the improvements are still more than 40 % and almost 30 %, respectively.
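The benefit of processing several packets per accelerator invocation can be illustrated with a simple cost model. The following Python sketch is not the paper's model: it assumes that each invocation carries a fixed setup overhead (for example, configuration and data transfer initiation) that list mode pays once per batch instead of once per packet. The function names and all cycle counts are hypothetical and chosen only for illustration.

```python
def accelerator_cycles(num_packets, setup_cycles, per_packet_cycles, list_mode):
    """Total cycles for num_packets under an assumed, simplified cost model.

    Single-packet mode pays the setup overhead once per packet;
    list mode pays it once for the whole batch.
    """
    if num_packets < 0:
        raise ValueError("num_packets must be non-negative")
    if num_packets == 0:
        return 0
    setups = 1 if list_mode else num_packets
    return setups * setup_cycles + num_packets * per_packet_cycles


def relative_saving(baseline, improved):
    """Fraction of the baseline cost that is saved (0.7 means 70 % saved)."""
    return 1.0 - improved / baseline


# Hypothetical numbers, not taken from the paper.
single = accelerator_cycles(8, setup_cycles=400, per_packet_cycles=100, list_mode=False)
batched = accelerator_cycles(8, setup_cycles=400, per_packet_cycles=100, list_mode=True)
saving = relative_saving(single, batched)
```

Under these assumed values, the per-invocation overhead dominates the single-packet case, so batching removes most of the cost. How large the effect is on a real platform depends on the actual setup and transfer overheads, which this sketch does not measure.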
