Split array and scalar data caches: a comprehensive study of data cache organization
[1] Sally A. McKee,et al. Smarter Memory: Improving Bandwidth for Streamed References , 1998, Computer.
[2] Norman P. Jouppi,et al. CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.
[3] David H. Albonesi,et al. Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[4] Wen-mei W. Hwu,et al. Run-time Adaptive Cache Hierarchy Via Reference Analysis , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[5] Afrin Naz,et al. A Study of Separate Array and Scalar Caches , 2004, HPCS.
[6] Chandra Krintz,et al. Cache-conscious data placement , 1998, ASPLOS VIII.
[7] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[8] Edward S. Davidson,et al. Reducing conflicts in direct-mapped caches with a temporality-based design , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[9] Charles C. Weems,et al. Application-adaptive intelligent cache memory system , 2002, TECS.
[10] Kimming So,et al. Cache Operations by MRU Change , 1988, IEEE Trans. Computers.
[11] James R. Larus,et al. Cache-conscious structure definition , 1999, PLDI '99.
[12] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[13] Edward McLellan. The Alpha AXP architecture and 21064 processor , 1993, IEEE Micro.
[14] Mateo Valero,et al. A victim cache for vector registers , 1997, ICS '97.
[15] Anne Rogers,et al. Supporting dynamic data structures on distributed-memory machines , 1995, TOPL.
[16] Norman P. Jouppi,et al. Reconfigurable caches and their application to media processing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[17] Alan Jay Smith,et al. Cache Memories , 1982, CSUR.
[18] Véronique Benzaken,et al. Enhancing Performance in a Persistent Object Store: Clustering Strategies in O2 , 1990, POS.
[19] Frank Vahid,et al. A highly configurable cache architecture for embedded systems , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings.
[20] Richard E. Kessler,et al. Evaluating stream buffers as a secondary cache replacement , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[21] Todd C. Mowry,et al. Compiler-based prefetching for recursive data structures , 1996, ASPLOS VII.
[22] Tack-Don Han,et al. A Power Efficient Cache Structure for Embedded Processors Based on the Dual Cache Structure , 2000, LCTES.
[23] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.
[24] Krishna M. Kavi,et al. Design of cache memories for dataflow architecture , 1998, J. Syst. Archit.
[25] Mateo Valero,et al. Software management of selective and dual data caches , 1997 .
[26] A. Agarwal,et al. Cache performance of operating systems and multiprogramming , 1988 .
[27] Todd C. Mowry,et al. Memory forwarding: enabling aggressive layout optimizations by guaranteeing the safety of data relocation , 1999, ISCA.
[28] Anant Agarwal,et al. Column-associative caches: a technique for reducing the miss rate of direct-mapped caches , 1993, ISCA '93.
[29] Kanad Ghose,et al. Energy-efficiency of VLSI caches: a comparative study , 1997, Proceedings Tenth International Conference on VLSI Design.
[30] Peter Petrov,et al. Towards effective embedded processors in codesigns: customizable partitioned caches , 2001, Ninth International Symposium on Hardware/Software Codesign. CODES 2001 (IEEE Cat. No.01TH8571).
[31] Jim D. Garside,et al. An asynchronous victim cache , 2002, Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
[32] Trevor Mudge,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001 .
[33] Hugo De Man,et al. Cache conscious data layout organization for conflict miss reduction in embedded multimedia applications , 2005, IEEE Transactions on Computers.
[34] Antonio Gonzalez,et al. A data cache with multiple caching strategies tuned to different types of locality , 1995, International Conference on Supercomputing.
[35] Krishna M. Kavi,et al. Cache Performance of Scheduled Dataflow Architecture , 2000 .
[36] Douglas Comer,et al. Ubiquitous B-Tree , 1979, CSUR.
[37] Afrin Naz,et al. A Study of Reconfigurable Split Data Caches and Instruction Caches , 2006, PDCS.
[38] Jang-Soo Lee,et al. A new cache architecture based on temporal and spatial locality , 2000, J. Syst. Archit.
[39] Frank Vahid,et al. Using a victim buffer in an application-specific memory hierarchy , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[40] Bruce Jacob,et al. Cache Design for Embedded Real-Time Systems , 1999 .
[41] Jörg Henkel,et al. Interface and cache power exploration for core-based embedded system design , 1999, ICCAD 1999.
[42] Paul R. Wilson,et al. Object Type Directed Garbage Collection To Improve Locality , 1992, IWMM.
[43] Krishna M. Kavi,et al. Performance enhancement by eliminating redundant function execution , 2006, 39th Annual Simulation Symposium (ANSS'06).
[44] Rajeev Balasubramonian,et al. Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures , 2000, MICRO 33.
[45] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[46] Amitabh Srivastava,et al. Analysis Tools , 2019, Public Transportation Systems.
[47] Frank Vahid,et al. Synthesis of customized loop caches for core-based embedded systems , 2002, ICCAD 2002.
[48] Nikil D. Dutt,et al. Automatic tuning of two-level caches to embedded applications , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[49] Dirk Grunwald,et al. A comparison of software code reordering and victim buffers , 1999, CARN.
[50] James E. Smith,et al. The predictability of data values , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[51] Kaushik Roy,et al. DRG-cache: a data retention gated-ground cache for low power , 2002, DAC '02.
[52] Chris J. Cheney. A nonrecursive list compacting algorithm , 1970, Commun. ACM.
[53] Dirk Grunwald,et al. Predictive sequential associative cache , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[54] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[55] Krishna M. Kavi,et al. Design of cache memories for multi-threaded dataflow architecture , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[56] Afrin Naz,et al. Making a case for split data caches for embedded applications , 2006, SIGARCH Comput. Archit. News.
[57] Srinivas Devadas,et al. Software-assisted cache replacement mechanisms for embedded systems , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).
[58] Jean-Loup Baer,et al. An effective on-chip preloading scheme to reduce data access penalty , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[59] Afrin Naz,et al. Improving data cache performance with integrated use of split caches, victim cache and stream buffers , 2005, SIGARCH Comput. Archit. News.
[60] R. Rajamani,et al. A CMOS RISC CPU with on-chip parallel cache , 1994, Proceedings of IEEE International Solid-State Circuits Conference - ISSCC '94.
[61] J. Banerjee,et al. Clustering a DAG for CAD Databases , 1988, IEEE Trans. Software Eng.
[62] Wen-mei W. Hwu,et al. Run-time spatial locality detection and optimization , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[63] John L. Henning. SPEC CPU2000: Measuring CPU Performance in the New Millennium , 2000, Computer.
[64] Kanad Ghose,et al. Analytical energy dissipation models for low-power caches , 1997, ISLPED '97.
[65] Matthew L. Seidl,et al. Segregating heap objects by reference behavior and lifetime , 1998, ASPLOS VIII.
[66] David A. Moon,et al. Garbage collection in a large LISP system , 1984, LFP '84.
[67] Israel Koren,et al. The minimax cache: an energy-efficient framework for media processors , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[68] Frank Vahid,et al. Energy benefits of a configurable line size cache for embedded systems , 2003, IEEE Computer Society Annual Symposium on VLSI, 2003. Proceedings.
[69] James R. Larus,et al. Using generational garbage collection to implement cache-conscious data placement , 1998, ISMM '98.
[70] Lee Jung-Hoon,et al. An energy efficient cache memory architecture for embedded systems , 2004, SAC '04.
[71] Ken Chan,et al. PA7200: a PA-RISC processor with integrated high performance MP bus interface , 1994, Proceedings of COMPCON '94.
[72] G.S. Sohi,et al. Dynamic instruction reuse , 1997, ISCA '97.