Power-Efficient Computer Architectures: Recent Advances
[1] Jian Li,et al. Dynamic power-performance adaptation of parallel computation on chip multiprocessors , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006.
[2] Lieven Eeckhout,et al. Scheduling heterogeneous multi-cores through performance impact estimation (PIE) , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[3] Nick Barrow-Williams,et al. Proximity coherence for chip multiprocessors , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[4] Ki Hwan Yum,et al. Adaptive data compression for high-performance low-power on-chip networks , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[5] David B. Whalley,et al. Speculative tag access for reduced energy dissipation in set-associative L1 data caches , 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[6] David Eklov,et al. Efficient software-based online phase classification , 2011, 2011 IEEE International Symposium on Workload Characterization (IISWC).
[7] William J. Dally,et al. Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[8] Wen-mei W. Hwu,et al. Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications , 2010, International Journal of Parallel Programming.
[9] Andreas Sembrant,et al. Power-Sleuth: A Tool for Investigating Your Program's Power Behavior , 2012, 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.
[10] Natalie D. Enright Jerger,et al. Virtual Circuit Tree Multicasting: A Case for On-Chip Hardware Multicast Support , 2008, 2008 International Symposium on Computer Architecture.
[11] Kazuaki Murakami,et al. Way-predicting set-associative cache for high performance and low energy consumption , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).
[12] Li-Shiuan Peh,et al. SWIFT: A Low-Power Network-On-Chip Implementing the Token Flow Control Router Architecture With Swing-Reduced Interconnects , 2013, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[13] Balaram Sinharoy,et al. IBM POWER7 multicore server processor , 2011 .
[14] Timothy Mattson,et al. A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[15] Yuval Peress,et al. Guaranteeing instruction fetch behavior with a lookahead instruction fetch engine (LIFE) , 2009, LCTES '09.
[16] Per Hammarlund,et al. 4th generation Intel™ Core processor, codenamed Haswell , 2013, 2013 IEEE Hot Chips 25 Symposium (HCS).
[17] Alaa R. Alameldeen,et al. Trading Off Cache Capacity for Low-Voltage Operation , 2009, IEEE Micro.
[18] Ravi Iyengar,et al. 28nm high-κ metal-gate heterogeneous quad-core CPUs for high-performance and energy-efficient mobile application processor , 2013, 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers.
[19] Shubhendu S. Mukherjee,et al. Measuring Architectural Vulnerability Factors , 2003, IEEE Micro.
[20] D. Blaauw,et al. Opportunities and challenges for better than worst-case design , 2005, Proceedings of the ASP-DAC 2005. Asia and South Pacific Design Automation Conference, 2005.
[21] Margaret Martonosi,et al. Formal online methods for voltage/frequency control in multiple clock domain microprocessors , 2004, ASPLOS XI.
[22] David B. Whalley,et al. Designing a practical data filter cache to improve both energy efficiency and performance , 2013, ACM Trans. Archit. Code Optim..
[23] Stefanos Kaxiras,et al. Green governors: A framework for Continuously Adaptive DVFS , 2011, 2011 International Green Computing Conference and Workshops.
[24] R. E. Kessler. The Cavium 32 Core OCTEON II 68xx , 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[25] Sandhya Dwarkadas,et al. Dynamic frequency and voltage control for a multiple clock domain microarchitecture , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[26] Andrew A. Chien,et al. The future of microprocessors , 2011, Commun. ACM.
[27] Trevor Mudge,et al. Automatic Performance Setting for Dynamic Voltage Scaling , 2002 .
[28] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[29] Li Shang,et al. Dynamic voltage scaling with links for power optimization of interconnection networks , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[30] Sang-Hyun Oh. Physics and technologies of vertical transistors , 2001 .
[31] Mark Horowitz,et al. 1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[32] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Michael L. Scott,et al. Integrating adaptive on-chip storage structures for reduced dynamic power , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[34] Sharad Malik,et al. Compile-time dynamic voltage scaling settings: opportunities and limits , 2003, PLDI '03.
[35] Dave Brown,et al. Supplementary Material for An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing , 2013 .
[36] José González,et al. Meeting points: Using thread criticality to adapt multicore hardware to parallel regions , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[37] Varghese George,et al. Power management of the third generation intel core micro architecture formerly codenamed ivy bridge , 2012, 2012 IEEE Hot Chips 24 Symposium (HCS).
[38] Vikram Bhatt,et al. GreenDroid: An architecture for the Dark Silicon Age , 2012, 17th Asia and South Pacific Design Automation Conference.
[39] Luis Ceze,et al. General-purpose code acceleration with limited-precision analog computation , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[40] Margaret Martonosi,et al. Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[41] Margaret Martonosi,et al. A dynamic compilation framework for controlling microprocessor energy and performance , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[42] Meeta Sharma Gupta,et al. System level analysis of fast, per-core DVFS using on-chip switching regulators , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[43] Marcelo Yuffe,et al. A fully integrated multi-CPU, GPU and memory controller 32nm processor , 2011, 2011 IEEE International Solid-State Circuits Conference.
[44] David Blaauw,et al. Limits of Parallelism and Boosting in Dim Silicon , 2013, IEEE Micro.
[45] Stijn Eyerman,et al. A Counter Architecture for Online DVFS Profitability Estimation , 2010, IEEE Transactions on Computers.
[46] C. Hu,et al. FinFET-a self-aligned double-gate MOSFET scalable to 20 nm , 2000 .
[47] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[48] Milos D. Ercegovac,et al. The Art of Deception: Adaptive Precision Reduction for Area Efficient Physics Acceleration , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[49] Gu-Yeon Wei,et al. The accelerator store: A shared memory framework for accelerator-based systems , 2012, TACO.
[50] Eric S. Chung,et al. LINQits: big data on little clients , 2013, ISCA.
[51] William H. Mangione-Smith,et al. The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[52] Luca P. Carloni,et al. Networks-on-chip in emerging interconnect paradigms: Advantages and challenges , 2009, 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip.
[53] Norman P. Jouppi,et al. Designing, packaging, and testing a 300-MHz, 115 W ECL microprocessor , 1994, IEEE Micro.
[54] Lin Zhong,et al. Self-constructive high-rate system energy modeling for battery-powered mobile systems , 2011, MobiSys '11.
[55] David B. Whalley,et al. Reducing instruction fetch energy in multi-issue processors , 2013, ACM Trans. Archit. Code Optim..
[56] David Blaauw,et al. A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS , 2012, 2012 IEEE International Solid-State Circuits Conference.
[57] Margaret Martonosi,et al. Dynamic-Compiler-Driven Control for Microprocessor Energy and Performance , 2006, IEEE Micro.
[58] Yale N. Patt,et al. Predicting Performance Impact of DVFS for Realistic Memory Systems , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[59] Andreas Moshovos. RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence , 2005, ISCA 2005.
[60] Matthias A. Blumrich,et al. Design and implementation of the blue gene/P snoop filter , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[61] Frank Vahid,et al. A Way-Halting Cache for Low-Energy High-Performance Systems , 2005, IEEE Computer Architecture Letters.
[62] Naveen Verma,et al. A Micro-Power EEG Acquisition SoC With Integrated Feature Extraction Processor for a Chronic Seizure Detection System , 2010, IEEE Journal of Solid-State Circuits.
[63] Scott Shenker,et al. Scheduling for reduced CPU energy , 1994, OSDI '94.
[64] Emil Talpes,et al. Toward a multiple clock/voltage island design style for power-aware processors , 2005, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[65] Margaret Martonosi,et al. Identifying program power phase behavior using power vectors , 2003, 2003 IEEE International Conference on Communications (Cat. No.03CH37441).
[66] Michael Bedford Taylor,et al. Is dark silicon useful? Harnessing the four horsemen of the coming dark silicon apocalypse , 2012, DAC Design Automation Conference 2012.
[67] William H. Mangione-Smith,et al. Filtering Memory References to Increase Energy Efficiency , 2000, IEEE Trans. Computers.
[68] Alyssa B. Apsel,et al. Leveraging Optical Technology in Future Bus-based Chip Multiprocessors , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[69] Sharad Malik,et al. Intraprogram dynamic voltage scaling: Bounding opportunities with analytic modeling , 2004, TACO.
[70] Murali Annavaram,et al. Mitigating Amdahl's Law through EPI Throttling , 2005, ISCA 2005.
[71] Vijayalakshmi Srinivasan,et al. A Tagless Coherence Directory , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[72] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[73] Friedemann Mattern,et al. From the Internet of Computers to the Internet of Things , 2010, From Active Data Management to Event-Based Systems and More.
[74] James R. Larus,et al. A reconfigurable fabric for accelerating large-scale datacenter services , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[75] Scott B. Baden,et al. Redefining the Role of the CPU in the Era of CPU-GPU Integration , 2012, IEEE Micro.
[76] Kaushik Roy,et al. Quality programmable vector processors for approximate computing , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[77] Natalie D. Enright Jerger,et al. Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[78] Kaushik Roy,et al. Reducing set-associative cache energy via way-prediction and selective direct-mapping , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[79] Stefanos Kaxiras,et al. Interval-based models for run-time DVFS orchestration in superscalar processors , 2010, CF '10.
[80] Niraj K. Jha,et al. Energy efficiency of handheld computer interfaces: limits, characterization and practice , 2005, MobiSys '05.
[81] Vikram Bhatt,et al. The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future , 2011, IEEE Micro.
[82] Anantha P. Chandrakasan,et al. Low-power CMOS digital design , 1992 .
[83] Lizy Kurian John,et al. Scaling to the end of silicon with EDGE architectures , 2004, Computer.
[84] Anantha Chandrakasan,et al. Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI , 2012, DAC Design Automation Conference 2012.
[85] Alan Gara,et al. Improving the accuracy of snoop filtering using stream registers , 2007, MEDEA '07.
[86] Shekhar Y. Borkar,et al. Design challenges of technology scaling , 1999, IEEE Micro.
[87] Babak Falsafi,et al. TurboTag: Lookup filtering to reduce coherence directory power , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).
[88] Uday Bondhugula,et al. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories , 2008, PPoPP.
[89] John Sartori,et al. Designing a processor from the ground up to allow voltage/reliability tradeoffs , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[90] Robert J. Wood,et al. Hardware in the loop for optical flow sensing in a robotic bee , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[91] Luca P. Carloni,et al. Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors , 2008, IEEE Transactions on Computers.
[92] Naresh R. Shanbhag,et al. Energy-efficient signal processing via algorithmic noise-tolerance , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).
[93] Aneesh Aggarwal,et al. Cache Noise Prediction , 2008, IEEE Transactions on Computers.
[94] James E. Smith,et al. A first-order superscalar processor model , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[95] Lin Gao,et al. Memory coloring: a compiler approach for scratchpad memory management , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[96] Li-Shiuan Peh,et al. Exploring the Design Space of Self-Regulating Power-Aware On/Off Interconnection Networks , 2007, IEEE Transactions on Parallel and Distributed Systems.
[97] Peter Marwedel,et al. Cache-aware scratchpad allocation algorithm , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[98] Uri C. Weiser,et al. Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors , 2006, IEEE Computer Architecture Letters.
[99] William J. Dally,et al. A compile-time managed multi-level register file hierarchy , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[100] Luis Ceze,et al. Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.
[101] Henry Hoffmann,et al. On-Chip Interconnection Architecture of the Tile Processor , 2007, IEEE Micro.
[102] Peter Marwedel,et al. Dynamic overlay of scratchpad memory for energy minimization , 2004, International Conference on Hardware/Software Codesign and System Synthesis, 2004. CODES + ISSS 2004.
[103] Sharad Malik,et al. EPROF: An energy/performance/reliability optimization framework for streaming applications , 2012, 17th Asia and South Pacific Design Automation Conference.
[104] Gu-Yeon Wei,et al. Shrink-Fit: A Framework for Flexible Accelerator Sizing , 2013, IEEE Computer Architecture Letters.
[105] Michael L. Scott,et al. Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[106] Luiz André Barroso,et al. The Case for Energy-Proportional Computing , 2007, Computer.
[107] Michael C. Huang,et al. The thrifty barrier: energy-aware synchronization in shared-memory multiprocessors , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[108] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).
[109] Hui Feng,et al. Compiler-directed scratchpad memory management via graph coloring , 2009, TACO.
[110] Sally A. McKee,et al. Portable, scalable, per-core power estimation for intelligent resource management , 2010, International Conference on Green Computing.
[111] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[112] Jason Cong,et al. CMP network-on-chip overlaid with multi-band RF-interconnect , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[113] Margaret Martonosi,et al. The XTREM power and performance simulator for the Intel XScale core: Design and experiences , 2007, TECS.
[114] James E. Smith,et al. A performance counter architecture for computing accurate CPI components , 2006, ASPLOS XII.
[115] M. Horowitz,et al. Low-power digital design , 1994, Proceedings of 1994 IEEE Symposium on Low Power Electronics.
[116] David A. Wood,et al. WiDGET: Wisconsin decoupled grid execution tiles , 2010, ISCA.
[117] John Arends,et al. Instruction fetch energy reduction using loop caches for embedded applications with small tight loops , 1999, ISLPED '99.
[118] P. Boyle,et al. A 300-MHz 115-W 32-b bipolar ECL microprocessor , 1993 .
[119] Quinn Jacobson,et al. ERSA: error resilient system architecture for probabilistic applications , 2010, DATE 2010.
[120] Alaa R. Alameldeen,et al. Trading off Cache Capacity for Reliability to Enable Low Voltage Operation , 2008, 2008 International Symposium on Computer Architecture.
[121] Martin Schulz,et al. Practical performance prediction under Dynamic Voltage Frequency Scaling , 2011, 2011 International Green Computing Conference and Workshops.
[122] Naehyuck Chang,et al. Accurate modeling and calculation of delay and energy overheads of dynamic voltage scaling in modern high-performance microprocessors , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).
[123] Ming Zhang,et al. Where is the energy spent inside my app?: fine grained energy accounting on smartphones with Eprof , 2012, EuroSys '12.
[124] R.H. Dennard,et al. Design Of Ion-implanted MOSFET's with Very Small Physical Dimensions , 1974, Proceedings of the IEEE.
[125] Babak Falsafi,et al. JETTY: filtering snoops for reduced energy consumption in SMP servers , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[126] David Blaauw,et al. Swizzle-Switch Networks for Many-Core Systems , 2012, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[127] Rami G. Melhem,et al. Energy aware scheduling for distributed real-time systems , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[128] Joseph A. Paradiso,et al. Energy scavenging for mobile and wireless electronics , 2005, IEEE Pervasive Computing.
[129] Margaret Martonosi,et al. Techniques for Multicore Thermal Management: Classification and New Exploration , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[130] Norman P. Jouppi,et al. Heterogeneous chip multiprocessors , 2005, Computer.
[131] Stijn Eyerman,et al. Criticality stacks: identifying critical threads in parallel programs using synchronization behavior , 2013, ISCA.