Power-Performance Comparison of Single-Task Driven Many-Cores
[1] Yu Liu,et al. Scheduling for energy efficiency and fault tolerance in hard real-time systems , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[2] George C. Caragea,et al. Brief announcement: performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor , 2009, SPAA '09.
[3] Christoph W. Kessler,et al. Practical PRAM programming , 2000, Wiley series on parallel and distributed computing.
[4] Michael Garland,et al. Designing efficient sorting algorithms for manycore GPUs , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[5] Margaret Martonosi,et al. Runtime power monitoring in high-end processors: methodology and empirical data , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[6] Uzi Vishkin,et al. A pilot study to compare programming effort for two parallel programming models , 2007, J. Syst. Softw..
[7] Andrew B. Kahng,et al. ORION 2.0: A Power-Area Simulator for Interconnection Networks , 2012, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[8] Uzi Vishkin,et al. XMT-GPU: A PRAM Architecture for Graphics Computation , 2008, 2008 37th International Conference on Parallel Processing.
[9] George C. Caragea,et al. General-Purpose vs. GPU: Comparison of Many-Cores on Irregular Workloads , 2010 .
[10] Gang Qu,et al. Layout-Accurate Design and Implementation of a High-Throughput Interconnection Network for Single-Chip Parallel Processing , 2007 .
[11] Joseph JáJá,et al. An Introduction to Parallel Algorithms , 1992 .
[12] Sanguthevar Rajasekaran,et al. Handbook of Parallel Computing - Models, Algorithms and Applications , 2007 .
[13] Natalie D. Enright Jerger,et al. Outstanding Research Problems in NoC Design: System, Microarchitecture, and Circuit Perspectives , 2009, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[14] Kevin Skadron,et al. Many-core design from a thermal perspective , 2008, 2008 45th ACM/IEEE Design Automation Conference.
[15] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[16] Uzi Vishkin,et al. Using simple abstraction to reinvent computing for parallelism , 2011, Commun. ACM.
[17] Norman P. Jouppi,et al. CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.
[18] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[19] Fuat Keceli,et al. Toolchain for Programming, Simulating and Studying the XMT Many-Core Architecture , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[20] Hyesoon Kim,et al. An integrated GPU power and performance model , 2010, ISCA.
[21] Uzi Vishkin,et al. Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract) , 1998, SPAA '98.
[22] Jiang Zhu,et al. Building a RCP (Rate Control Protocol) Test Network , 2007 .
[23] Andrew A. Chien,et al. The future of microprocessors , 2011, Commun. ACM.
[24] Uzi Vishkin,et al. Towards a First Vertical Prototyping of an Extremely Fine-Grained Parallel Programming Approach , 2003, Theory of Computing Systems.
[25] S. Nassif,et al. Full chip leakage-estimation considering power supply and temperature variations , 2003, Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03..
[26] Margaret Martonosi,et al. Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[27] Norman P. Jouppi,et al. CACTI 6.0: A Tool to Model Large Caches , 2009 .
[28] Uzi Vishkin,et al. FPGA-based prototype of a PRAM-on-chip processor , 2008, CF '08.
[29] Coniferous softwood. GENERAL TERMS , 2003 .
[30] George C. Caragea,et al. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform , 2006, Handbook of Parallel Computing.
[31] Ralph Grishman,et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine , 1998, ISCA '98.
[32] Uzi Vishkin,et al. PRAM-on-chip: first commitment to silicon , 2007, SPAA '07.
[33] Uzi Vishkin,et al. Using Simple Abstraction to Guide the Reinvention of Computing for Parallelism , 2009 .
[34] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[35] Michael Garland,et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[36] Uzi Vishkin,et al. Is teaching parallel algorithmic thinking to high school students possible?: one teacher's experience , 2010, SIGCSE.
[37] Kevin Skadron,et al. Temperature-aware microarchitecture , 2003, ISCA '03.
[38] Aydin O. Balkan. Mesh-of-Trees Interconnection Network for an Explicitly Multi-Threaded Parallel Computer Architecture , 2008 .
[39] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[40] Ralph Grishman,et al. The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer , 1983, IEEE Transactions on Computers.