Gables: A Roofline Model for Mobile SoCs
[1] Uri C. Weiser,et al. MultiAmdahl: How Should I Divide My Heterogeneous Chip?, 2012, IEEE Computer Architecture Letters.
[2] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[3] Willie Anderson,et al. Hexagon DSP: An Architecture Optimized for Mobile Multimedia and Communications , 2014, IEEE Micro.
[4] Samuel Williams,et al. Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis , 2014, PMBS@SC.
[5] Vijay Janapa Reddi,et al. Mobile CPU's rise to power: Quantifying the impact of generational mobile CPU design trends on performance, energy, and user satisfaction, 2016, IEEE International Symposium on High Performance Computer Architecture (HPCA).
[6] Avi Mendelson,et al. Many-Core vs. Many-Thread Machines: Stay Away From the Valley , 2009, IEEE Computer Architecture Letters.
[7] Mark D. Hill,et al. Amdahl's Law in the Multicore Era, 2008.
[8] Edward D. Lazowska,et al. Quantitative system performance - computer system analysis using queueing network models, 1984, Int. CMG Conference.
[9] Christina Delimitrou,et al. Amdahl's law for tail latency , 2018, Commun. ACM.
[10] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities, 1967, AFIPS '67 (Spring).
[11] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[12] Vijay Janapa Reddi,et al. Two Billion Devices and Counting , 2018, IEEE Micro.
[13] John L. Gustafson,et al. Reevaluating Amdahl's law , 1988, CACM.
[14] Carl Staelin,et al. lmbench: Portable Tools for Performance Analysis , 1996, USENIX Annual Technical Conference.
[15] Pradeep Dubey,et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU , 2010, ISCA.
[16] David A. Wood,et al. LogCA: A high-level performance model for hardware accelerators, 2017, ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[17] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[18] Alan Jay Smith,et al. Evaluating Associativity in CPU Caches , 1989, IEEE Trans. Computers.
[19] Gu-Yeon Wei,et al. The Aladdin Approach to Accelerator Design and Modeling , 2015, IEEE Micro.
[20] Mahmut T. Kandemir,et al. Anatomy of GPU Memory System for Multi-Application Execution , 2015, MEMSYS.
[21] Anna Gerber,et al. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Versions 3.0 and 3.1, 2016.