Quasistatic shared libraries and XIP for memory footprint reduction in MMU-less embedded systems

Despite a rapid decrease in the price of solid-state memory devices, system memory remains a scarce resource in embedded systems. Shared libraries and execution-in-place (XIP) are known to reduce memory usage significantly. Unfortunately, many resource-constrained embedded systems lack an MMU, which makes these techniques extremely difficult to support. To address this problem, we propose a novel shared library technique, called a quasistatic shared library, and an XIP technique, both based on our enhanced position-independent code. In our quasistatic shared libraries, global symbols are bound to pseudoaddresses at link time, and actual physical addresses are bound at load time. Unlike conventional shared libraries, they do not require symbol tables, which take up valuable memory space, and they therefore allow faster address translation at runtime. Our XIP technique relies on the same enhanced position-independent code, in which a data section can be located at an arbitrary address. Both the shared library and XIP techniques are made possible by emulating an MMU's memory mapping feature with a data section base register (DSBR) and a data section base table (DSBT). We have implemented the proposed techniques in a commercial ADSL (Asymmetric Digital Subscriber Line) home network gateway equipped with an MMU-less ARM7TDMI processor core, 2 MB of flash memory, and 16 MB of RAM. We measured its memory usage and evaluated its performance overhead in a series of experiments. The experiments show a 35% reduction in flash memory usage when only the shared library is used, and a 30% reduction in RAM usage when the shared library and XIP are used together. These reductions were achieved with a performance penalty of less than 4%.
Even though these techniques were applied to uClinux-based embedded systems, they can be used for any MMU-less real-time operating system.
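
The DSBR/DSBT idea described above can be illustrated with a minimal C sketch. It is not the authors' implementation: the pseudoaddress layout (a module index in the upper 8 bits and a data section offset in the lower 24 bits), the table size, and all names (`dsbt_t`, `make_pseudo`, `dsbt_bind`, `dsbt_resolve`, `enter_module`, `dsbr_access`) are assumptions made for illustration. The sketch only shows how a table of data section bases, filled in at load time, can stand in for MMU mapping without any symbol table lookup.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 8-bit module index, 24-bit data section offset. */
#define MAX_MODULES 8u
#define OFFSET_BITS 24u
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)

/* Emulated DSBT: one data section base address per module. */
typedef struct {
    uintptr_t data_base[MAX_MODULES];
} dsbt_t;

/* Emulated DSBR: base of the data section of the running module. */
static uintptr_t dsbr;

/* Link time: a global symbol is bound to a pseudoaddress. */
static uint32_t make_pseudo(uint32_t module_id, uint32_t offset)
{
    assert(module_id < MAX_MODULES && offset <= OFFSET_MASK);
    return (module_id << OFFSET_BITS) | offset;
}

/* Load time: record where the loader placed a module's data section. */
static void dsbt_bind(dsbt_t *t, uint32_t module_id, void *base)
{
    assert(module_id < MAX_MODULES);
    t->data_base[module_id] = (uintptr_t)base;
}

/* Run time: one table index plus one add yields the real address. */
static void *dsbt_resolve(const dsbt_t *t, uint32_t pseudo)
{
    uint32_t module_id = pseudo >> OFFSET_BITS;
    assert(module_id < MAX_MODULES);
    return (void *)(t->data_base[module_id] + (pseudo & OFFSET_MASK));
}

/* On entry to a module, point the DSBR at that module's data section. */
static void enter_module(const dsbt_t *t, uint32_t module_id)
{
    assert(module_id < MAX_MODULES);
    dsbr = t->data_base[module_id];
}

/* Position-independent data access relative to the DSBR. */
static void *dsbr_access(uint32_t offset)
{
    return (void *)(dsbr + offset);
}

/* Two data sections placed at arbitrary addresses by a pretend loader. */
static int app_data[4] = {1, 2, 3, 4};
static int lib_data[4] = {10, 20, 30, 40};

static int dsbt_demo(void)
{
    dsbt_t table = {{0}};
    dsbt_bind(&table, 0, app_data);   /* module 0: application */
    dsbt_bind(&table, 1, lib_data);   /* module 1: shared library */

    /* Cross-module reference to the library's third int. */
    uint32_t sym = make_pseudo(1, 2u * (uint32_t)sizeof(int));
    int from_lib = *(int *)dsbt_resolve(&table, sym);

    /* Module-local reference to the application's second int. */
    enter_module(&table, 0);
    int from_app = *(int *)dsbr_access((uint32_t)sizeof(int));

    return from_lib + from_app;
}
```

In this sketch, code never embeds an absolute data address, so the code itself could stay in flash (as with XIP) while each data section is placed wherever RAM is available. Only the table entries change from one load to the next.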
