AMNESIAC: Amnesic Automatic Computer - Trading Computation for Communication for Energy Efficiency
[1] Mark Horowitz, et al. Energy dissipation in general purpose microprocessors, 1996, IEEE J. Solid State Circuits.
[2] William J. Dally, et al. GPUs and the Future of Parallel Computing, 2011, IEEE Micro.
[3] David M. Brooks, et al. Energy characterization and instruction-level energy model of Intel's Xeon Phi processor, 2013, International Symposium on Low Power Electronics and Design (ISLPED).
[4] Mahmut T. Kandemir, et al. Reducing Off-Chip Memory Access Costs Using Data Recomputation in Embedded Chip Multi-processors, 2007, 2007 44th ACM/IEEE Design Automation Conference.
[5] Seung-Moon Yoo, et al. FlexRAM: toward an advanced intelligent memory system, 1999, Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040).
[6] Mahmut T. Kandemir, et al. Studying storage-recomputation tradeoffs in memory-constrained embedded processing, 2005, Design, Automation and Test in Europe.
[7] Harold S. Stone, et al. A Logic-in-Memory Computer, 1970, IEEE Transactions on Computers.
[8] Lieven Eeckhout, et al. Sniper: Exploring the level of abstraction for scalable and accurate parallel multi-core simulation, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[9] Anoop Gupta, et al. Design and evaluation of a compiler algorithm for prefetching, 1992, ASPLOS V.
[10] William J. Dally, et al. A bandwidth-efficient architecture for media processing, 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[11] Craig Zilles, et al. Execution-based prediction using speculative slices, 2001, ISCA 2001.
[12] P.M. Kogge, et al. Pursuing a petaflop: point designs for 100 TF computers using PIM technologies, 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).
[13] Mahmut T. Kandemir, et al. Minimizing Energy Consumption of Banked Memories Using Data Recomputation, 2006, ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design.
[14] M. Martonosi, et al. Timekeeping in the memory system: predicting and optimizing memory behavior, 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[15] Mario Badr, et al. Load Value Approximation, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[16] G.S. Sohi, et al. Dynamic instruction reuse, 1997, ISCA '97.
[17] Kai Li, et al. The PARSEC benchmark suite: Characterization and architectural implications, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[18] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[19] Eric Rotenberg, et al. Slipstream processors: improving both performance and fault tolerance, 2000, SIGP.
[20] Engin Ipek, et al. Resistive computation: avoiding the power wall with low-leakage, STT-MRAM based computing, 2010, ISCA.
[21] Mikko H. Lipasti, et al. Value locality and load value prediction, 1996, ASPLOS VII.
[22] John Paul Shen, et al. Speculative precomputation: long-range prefetching of delinquent loads, 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[23] M. Oskin, et al. Active Pages: a computation model for intelligent memory, 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[24] David H. Bailey, et al. The NAS parallel benchmarks summary and preliminary results, 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[25] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.
[26] Mark Horowitz, et al. 1.1 Computing's energy problem (and what we can do about it), 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[27] Lieven Eeckhout, et al. The Load Slice Core microarchitecture, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[28] Dionisios N. Pnevmatikatos, et al. Slice-processors: an implementation of operation-based prediction, 2001, ICS '01.
[29] Christoforos E. Kozyrakis, et al. A case for intelligent RAM, 1997, IEEE Micro.
[30] Sheng Li. An integrated power, area, and timing modeling framework for the design of multithreaded and multi/manycore architectures, 2010.
[31] Seung-Moon Yoo, et al. FlexRAM: Toward an advanced Intelligent Memory system, 1999, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[32] Karthikeyan Sankaralingam, et al. Idempotent processor architecture, 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Gurindar S. Sohi, et al. A quantitative framework for automated pre-execution thread selection, 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[34] Dr. Jurij Šilc, et al. Processor Architecture, 1999, Springer Berlin Heidelberg.
[35] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.