CIACP: A Correlation- and Iteration-Aware Cache Partitioning Mechanism to Improve Performance of Multiple Coarse-Grained Reconfigurable Arrays
Leibo Liu | Shouyi Yin | Shaojun Wei | Kai Luo | Chen Yang