Power/Performance/Area Evaluations for Next-Generation HPC Processors using the A64FX Chip
Mitsuhisa Sato | Yuetsu Kodama | Miwako Tsuji | Tetsuya Odajima | Eishi Arima
[1] Niraj K. Jha, et al. McPAT-PVT: Delay and Power Modeling Framework for FinFET Processor Architectures Under PVT Variations, 2015, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[2] David A. Wood, et al. Adaptive cache compression for high-performance processors, 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[3] Mitsuhisa Sato, et al. Accuracy Improvement of Memory System Simulation for Modern Shared Memory Processor, 2020, HPC Asia.
[4] Partha Pratim Pande, et al. Machine Learning for Design Space Exploration and Optimization of Manycore Systems, 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[5] Mohammad Alian, et al. dist-gem5: Distributed simulation of computer clusters, 2017, 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[6] Mitsuhisa Sato, et al. Performance and power consumption analysis of Arm Scalable Vector Extension, 2020.
[7] Sally A. McKee, et al. Efficiently exploring architectural design spaces via predictive modeling, 2006, ASPLOS XII.
[8] Y. Kodama, et al. Co-Design for A64FX Manycore Processor and "Fugaku", 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis.
[9] Jung Ho Ahn, et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Eishi Arima. Classification-Based Unified Cache Replacement via Partitioned Victim Address History, 2020, 2020 23rd Euromicro Conference on Digital System Design (DSD).
[11] Onur Mutlu, et al. A case for toggle-aware compression for GPU systems, 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[12] George Ho, et al. PAPI: A Portable Interface to Hardware Performance Counters, 1999.
[13] Mitsuhisa Sato, et al. Preliminary Performance Evaluation of Application Kernels Using ARM SVE with Multiple Vector Lengths, 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).
[14] Gu-Yeon Wei, et al. Co-designing accelerators and SoC interfaces using gem5-Aladdin, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[15] Niraj K. Jha, et al. McPAT-Monolithic: An Area/Power/Timing Architecture Modeling Framework for 3-D Hybrid Monolithic Multicore Systems, 2020, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[16] Pugach Nataliya, et al. International roadmap for devices and systems. Cryogenic electronics and quantum information processing. 2018 Update, 2019.
[17] David A. Wood, et al. gem5-gpu: A Heterogeneous CPU-GPU Simulator, 2015, IEEE Computer Architecture Letters.
[18] Diederik Verkest, et al. EMPIRE: Empirical power/area/timing models for register files, 2009, Microprocess. Microsystems.
[19] Hiroshi Nakamura, et al. Immediate sleep: Reducing energy impact of peripheral circuits in STT-MRAM caches, 2015, 2015 33rd IEEE International Conference on Computer Design (ICCD).