Concise loads and stores: The case for an asymmetric compute-memory architecture for approximation
Scott A. Mahlke | Lingjia Tang | Parker Hill | Jason Mars | Michael Laurenzano | Muneeb Khan | Shih-Chieh Lin | Animesh Jain | Md E. Haque
[1] David A. Patterson,et al. Computer organization and design: the hardware-software interface (appendix A) , 1993 .
[2] Chau-Wen Tseng,et al. Data transformations for eliminating conflict misses , 1998, PLDI.
[3] Lim Lim Hwee,et al. Dishing and nitride erosion of STI-CMP for different integration schemes , 2001 .
[4] Eric Rotenberg,et al. Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM , 2006, The Twelfth International Symposium on High-Performance Computer Architecture.
[5] Norman P. Jouppi,et al. CACTI 6.0: A Tool to Model Large Caches , 2009 .
[6] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[7] Jun Yang,et al. A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.
[8] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[9] Onur Mutlu,et al. Architecting phase change memory as a scalable dram alternative , 2009, ISCA '09.
[10] Vijayalakshmi Srinivasan,et al. Scalable high performance main memory system using phase-change memory technology , 2009, ISCA '09.
[11] Anand Raghunathan,et al. Best-effort parallel execution framework for Recognition and mining applications , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[12] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[13] Teresa H. Y. Meng,et al. Towards program optimization through automated analysis of numerical precision , 2010, CGO '10.
[14] Bradford M. Beckmann,et al. The gem5 simulator , 2011, CARN.
[15] Henry Hoffmann,et al. Dynamic knobs for responsive power-aware computing , 2011, ASPLOS XVI.
[16] Dan Grossman,et al. EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.
[17] Song Liu,et al. Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.
[18] Onur Mutlu,et al. Base-delta-immediate compression: Practical data compression for on-chip caches , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[19] Luis Ceze,et al. Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.
[20] Lifan Xu,et al. Auto-tuning a high-level language targeted to GPU codes , 2012, 2012 Innovative Parallel Computing (InPar).
[21] Jacob Nelson,et al. Approximate storage in solid-state memories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[22] Somayeh Sardashti,et al. Decoupled compressed cache: Exploiting spatial locality for energy-optimized compressed caching , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[23] James Demmel,et al. Precimonious: Tuning assistant for floating-point precision , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[24] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[25] Kaushik Roy,et al. Analysis and characterization of inherent application resilience for approximate computing , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[26] Mario Badr,et al. Load Value Approximation , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[27] Kathryn S. McKinley,et al. Uncertain: a first-order type for uncertain data , 2014, ASPLOS.
[28] Somayeh Sardashti,et al. Skewed Compressed Caches , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[29] Per Stenström,et al. SC2: A statistical compression cache scheme , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[30] Dan Grossman,et al. Expressing and verifying probabilistic assertions , 2014, PLDI.
[31] Kia Bazargan,et al. Axilog: Language support for approximate hardware design , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[32] Song Han,et al. Learning both Weights and Connections for Efficient Neural Networks , 2015, NIPS.
[33] Hadi Esmaeilzadeh,et al. Neural acceleration for GPU throughput processors , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[34] Natalie D. Enright Jerger,et al. Doppelgänger: A cache for approximate computing , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[35] Ronald G. Dreslinski,et al. Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers , 2015, ASPLOS.
[36] Arnab Raha,et al. Quality-aware data allocation in approximate DRAM , 2015, 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES).
[37] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[38] Yoshua Bengio,et al. Low precision arithmetic for deep learning , 2014, ICLR.
[39] Scott A. Mahlke,et al. Input responsiveness: using canary inputs to dynamically steer approximation , 2016, PLDI.
[40] Martin C. Rinard,et al. Verifying quantitative reliability for programs that execute on unreliable hardware , 2013, OOPSLA.