Specific read only data management for memory hierarchy optimization

The growing number of cores in embedded systems has increased the pressure on the memory hierarchy. The cost of coherence protocols and the limited scalability of the memory hierarchy are now major issues. In this paper, a specific data management scheme for read-only data is investigated, because such data can be duplicated in several memories without being tracked by the coherence protocol. An analysis of standard benchmarks for embedded systems shows that read-only data represent 62% of all the data used by applications and 18% of all memory accesses. A specific data path for read-only data is then evaluated through simulation. At the first level of the memory hierarchy, removing read-only data from the L1 cache and placing them in a separate read-only cache improves the data locality of read-write data by 30% and decreases the total energy consumption of the first-level memory by 5%.
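As an illustration of the kind of measurement summarized above, the following minimal sketch computes the share of read-only data and the share of accesses that target them from a memory trace. It is not the paper's tooling: the trace format (a list of `(op, address)` pairs), the function name `read_only_stats`, and the rule that an address counts as read-only only if it is never written during the run are all assumptions made for this example.

```python
from collections import defaultdict


def read_only_stats(trace):
    """Return (share of read-only data, share of accesses to read-only data).

    `trace` is a hypothetical iterable of (op, address) pairs, with op
    being 'R' for a load or 'W' for a store. An address is treated as
    read-only if it is read at least once and never written.
    """
    reads = defaultdict(int)   # number of reads per address
    written = set()            # addresses written at least once
    total = 0                  # total number of accesses in the trace

    for op, addr in trace:
        total += 1
        if op == 'W':
            written.add(addr)
        else:
            reads[addr] += 1

    touched = set(reads) | written      # every address seen in the trace
    read_only = set(reads) - written    # read but never written
    ro_accesses = sum(reads[a] for a in read_only)

    data_share = len(read_only) / len(touched) if touched else 0.0
    access_share = ro_accesses / total if total else 0.0
    return data_share, access_share


# Toy trace (assumed data, not taken from any benchmark).
trace = [('R', 0x10), ('R', 0x10), ('W', 0x20),
         ('R', 0x20), ('R', 0x30), ('W', 0x40)]
data_share, access_share = read_only_stats(trace)
print(f"read-only data: {data_share:.0%}, accesses to them: {access_share:.0%}")
```

A real study would likely work at the granularity of variables or cache lines rather than raw addresses, and would need a policy for data that are written once at initialization and only read afterwards; this sketch classifies such data as read-write.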