Cost aware cache replacement policy in shared last-level cache for hybrid memory based fog computing

ABSTRACT Fog computing requires a large main memory capacity to decrease latency and increase Quality of Service (QoS). However, dynamic random access memory (DRAM), the commonly used random access memory, cannot be included in a fog computing system at that capacity due to its high power consumption. In recent years, non-volatile memories (NVM) such as Phase-Change Memory (PCM) and Spin-Transfer Torque RAM (STT-RAM), with their low power consumption, have emerged as replacements for DRAM. Moreover, recently proposed hybrid main memory, consisting of both DRAM and NVM, has shown promising advantages in scalability and power consumption. However, the drawbacks of NVM, such as long read/write latency, make cache miss costs asymmetric in hybrid main memory: a miss served from NVM is more expensive than a miss served from DRAM. Current last-level cache (LLC) policies assume a uniform miss cost, which results in poor LLC performance and adds to the cost of using NVM. To minimize the cache miss cost in hybrid main memory, we propose a cost-aware cache replacement policy (CACRP) that reduces the number of cache misses served from NVM and improves cache performance for a hybrid memory system. Experimental results show that CACRP improves LLC performance by up to 43.6% (15.5% on average) compared to LRU.
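To make the idea of asymmetric miss cost concrete, the following is a minimal sketch, not the CACRP algorithm itself, whose details the abstract does not give. It models a single cache set and compares plain LRU with a simple illustrative heuristic that prefers to evict a DRAM-backed block among the least recently used candidates. The names `MISS_COST`, `simulate`, and `window`, as well as the 1:4 cost ratio, are assumptions chosen for illustration only.

```python
from collections import OrderedDict

# Hypothetical relative miss costs (assumption, not taken from the paper).
MISS_COST = {"DRAM": 1, "NVM": 4}


def simulate(trace, ways, policy, window=2):
    """Replay `trace` on one cache set and return the total miss cost.

    trace  : list of (address, memory_type) pairs, memory_type in MISS_COST
    ways   : number of blocks the set can hold
    policy : "lru" or "cost_aware" (illustrative heuristic, not CACRP)
    window : how many least recently used blocks the heuristic inspects
    """
    cache = OrderedDict()  # address -> memory type, least recently used first
    total = 0
    for addr, mem in trace:
        if addr in cache:
            cache.move_to_end(addr)  # hit: mark as most recently used
            continue
        total += MISS_COST[mem]  # miss: pay the cost of the backing memory
        if len(cache) >= ways:
            victim = next(iter(cache))  # default victim: the LRU block
            if policy == "cost_aware":
                # Prefer a DRAM block among the `window` oldest blocks,
                # because re-fetching it later is cheaper than an NVM block.
                for cand in list(cache)[:window]:
                    if cache[cand] == "DRAM":
                        victim = cand
                        break
            del cache[victim]
        cache[addr] = mem
    return total


if __name__ == "__main__":
    trace = [("n", "NVM"), ("d1", "DRAM"), ("d2", "DRAM")] * 2
    print(simulate(trace, ways=2, policy="lru"))         # 12
    print(simulate(trace, ways=2, policy="cost_aware"))  # 8
```

On this toy trace, LRU misses on every access, including both accesses to the NVM block, while the heuristic keeps the NVM block resident and pays only for the cheaper DRAM misses. A policy that ignores miss cost can therefore have the same or a similar miss count and still incur a higher total miss penalty.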
