Optimal Application Mapping and Scheduling for Network-on-Chips with Computation in STT-RAM Based Router
Lei Yang | Nikil Dutt | Weichen Liu | Nan Guan
[1] Yuan Xie,et al. Hybrid Drowsy SRAM and STT-RAM Buffer Designs for Dark-Silicon-Aware NoC , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[2] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[3] Yuankun Xue,et al. User Cooperation Network Coding Approach for NoC Performance Improvement , 2015, NOCS.
[4] Turbo Majumder,et al. NoC router using STT-MRAM based hybrid buffers with error correction and limited flit retransmission , 2015, 2015 IEEE International Symposium on Circuits and Systems (ISCAS).
[5] Cong Xu,et al. Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[6] Engin Ipek,et al. Resistive computation: avoiding the power wall with low-leakage, STT-MRAM based computing , 2010, ISCA.
[7] Rajesh Gupta,et al. Network topology exploration of mesh-based coarse-grain reconfigurable architectures , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[8] Mircea R. Stan,et al. Relaxing non-volatility for fast and energy-efficient STT-RAM caches , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[9] Tejas Karkhanis,et al. Active Memory Cube: A processing-in-memory architecture for exascale systems , 2015, IBM J. Res. Dev..
[10] Luan Tran,et al. 45nm low power CMOS logic compatible embedded STT MRAM utilizing a reverse-connection 1T/1MTJ cell , 2009, 2009 IEEE International Electron Devices Meeting (IEDM).
[11] Tao Zhang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[12] Norman P. Jouppi,et al. CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.
[13] Radu Marculescu,et al. An efficient Network-on-Chip (NoC) based multicore platform for hierarchical parallel genetic algorithms , 2014, 2014 Eighth IEEE/ACM International Symposium on Networks-on-Chip (NoCS).
[14] Ki Hwan Yum,et al. A Hybrid Buffer Design with STT-MRAM for On-Chip Interconnects , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.
[15] Jason Cong,et al. Minimizing Computation in Convolutional Neural Networks , 2014, ICANN.
[16] Onur Mutlu,et al. A case for bufferless routing in on-chip networks , 2009, ISCA '09.
[17] Jun Yang,et al. A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.
[18] Shahin Nazarian,et al. Prometheus: Processing-in-memory heterogeneous architecture design from a multi-layer network theoretic strategy , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[19] Edwin Hsing-Mean Sha,et al. Optimal functional unit assignment and voltage selection for pipelined MPSoC with guaranteed probability on time performance , 2017, LCTES.
[20] Mohamed El-Sayed Ragab,et al. Flexible router architecture for network-on-chip , 2012, Comput. Math. Appl..
[21] An-Yeu Wu,et al. Path-Congestion-Aware Adaptive Routing With a Contention Prediction Scheme for Network-on-Chip Systems , 2014, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[22] Chao Chen,et al. Hardware-software collaboration for dark silicon heterogeneous many-core systems , 2017, Future Gener. Comput. Syst..
[23] Rami G. Melhem,et al. Domain-wall memory buffer for low-energy NoCs , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[24] Cong Xu,et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[25] Wei Zhang,et al. Distributed Sensor Network-on-Chip for Performance Optimization of Soft-Error-Tolerant Multiprocessor System-on-Chip , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[26] Wei Zhang,et al. Thermal-Aware Task Mapping on Dynamically Reconfigurable Network-on-Chip Based Multiprocessor System-on-Chip , 2018, IEEE Transactions on Computers.
[27] Yuankun Xue,et al. Improving NoC performance under spatio-temporal variability by runtime reconfiguration: a general mathematical framework , 2016, 2016 Tenth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[28] Edwin Hsing-Mean Sha,et al. Optimal functional-unit assignment and buffer placement for probabilistic pipelines , 2016, 2016 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
[29] George Michelogiannakis,et al. Elastic-buffer flow control for on-chip networks , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[30] Anand Raghunathan,et al. Computing in Memory With Spin-Transfer Torque Magnetic RAM , 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[31] Edwin Hsing-Mean Sha,et al. FoToNoC: A Folded Torus-Like Network-on-Chip Based Many-Core Systems-on-Chip in the Dark Silicon Era , 2017, IEEE Transactions on Parallel and Distributed Systems.
[32] Wenqing Wu,et al. Multi retention level STT-RAM cache designs with a dynamic refresh scheme , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Huawei Li,et al. ProPRAM: Exploiting the transparent logic resources in Non-Volatile Memory for Near Data Computing , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[34] Shahin Nazarian,et al. A load balancing inspired optimization framework for exascale multicore systems: A complex networks approach , 2017, 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[35] Wei Zhang,et al. Traffic-Aware Application Mapping for Network-on-Chip Based Multiprocessor System-on-Chip , 2015, 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems.
[36] Marios C. Papaefthymiou,et al. Computational sprinting , 2012, IEEE International Symposium on High-Performance Comp Architecture.
[37] Seung H. Kang,et al. A 45nm 1Mb embedded STT-MRAM with design techniques to minimize read-disturbance , 2011, 2011 Symposium on VLSI Circuits - Digest of Technical Papers.
[38] Lei Zhou,et al. Optimal Functional-Unit Assignment for Heterogeneous Systems Under Timing Constraint , 2017, IEEE Transactions on Parallel and Distributed Systems.
[39] Edwin Hsing-Mean Sha,et al. Application Mapping and Scheduling for Network-on-Chip-Based Multiprocessor System-on-Chip With Fine-Grain Communication Optimization , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[40] Chita R. Das,et al. Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[41] Peng Chen,et al. Task mapping on SMART NoC: Contention matters, not the distance , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[42] Edwin Hsing-Mean Sha,et al. On the Design of Minimal-Cost Pipeline Systems Satisfying Hard/Soft Real-Time Constraints , 2021, IEEE Transactions on Emerging Topics in Computing.
[43] Nectarios Koziris,et al. An efficient algorithm for the physical mapping of clustered task graphs onto multiprocessor architectures , 2000, Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing.
[44] Christian Bienia,et al. PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors , 2009.