Factored multi-core architectures
[1] Glenn Reinman,et al. Selective value prediction , 1999, ISCA.
[2] Brad Calder,et al. Basic block distribution analysis to find periodic behavior and simulation points in applications , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[3] Kai Wang,et al. Highly accurate data value prediction using hybrid predictors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[4] David Blaauw,et al. Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation , 2003, MICRO.
[5] R. Viswanath. Thermal Performance Challenges from Silicon to Systems , 2000 .
[6] James E. Smith,et al. Managing multi-configuration hardware via dynamic working set analysis , 2002, ISCA.
[7] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] James E. Smith,et al. The predictability of data values , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[9] Paul T. Hulina,et al. A decoupled access/execute architecture for efficient access of structured data , 1993, [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.
[10] Glenn Reinman,et al. An Evaluation of Deeply Decoupled Cores , 2006, J. Instr. Level Parallelism.
[11] Jean-Loup Baer,et al. Effective Hardware Based Data Prefetching for High-Performance Processors , 1995, IEEE Trans. Computers.
[12] Margaret Martonosi,et al. Temperature-Aware Design Issues for SMT and CMP Architectures , 2004 .
[13] Norman P. Jouppi,et al. Conjoined-Core Chip Multiprocessing , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[14] José González,et al. Power-aware control speculation through selective throttling , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[15] David J. Sager,et al. The microarchitecture of the Pentium 4 processor , 2001 .
[16] Kevin Skadron,et al. Temperature-aware microarchitecture , 2003, ISCA '03.
[17] John Paul Shen,et al. Efficacy and performance impact of value prediction , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[18] Mark Bohr. Silicon trends and limits for advanced microprocessors , 1998, CACM.
[19] R. D. Barnes,et al. An Architectural Framework for Run-Time Optimization , 2001 .
[20] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[22] Sally A. McKee,et al. Hitting the memory wall: implications of the obvious , 1995, CARN.
[23] James E. Smith,et al. Comparing Program Phase Detection Techniques , 2003, MICRO.
[24] Dirk Grunwald,et al. Thermal Management with Asymmetric Dual Core Designs , 2003 .
[25] Seung-Moon Yoo,et al. A framework for dynamic energy efficiency and temperature management , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.
[26] James E. Smith,et al. Instruction-Level Distributed Processing , 2001, Computer.
[27] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[28] José González,et al. The potential of data value speculation to boost ILP , 1998, ICS '98.
[29] Chris Wilkerson,et al. Locality vs. criticality , 2001, ISCA 2001.
[30] William H. Mangione-Smith,et al. The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[31] James E. Smith,et al. An instruction set and microarchitecture for instruction level distributed processing , 2002, ISCA.
[32] Todd C. Mowry,et al. Cooperative prefetching: compiler and hardware support for effective instruction prefetching in modern processors , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[33] S. McFarling. Combining Branch Predictors , 1993 .
[34] Brad Calder,et al. Phase tracking and prediction , 2003, ISCA '03.
[35] Richard E. Kessler,et al. Evaluating stream buffers as a secondary cache replacement , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[36] Jean-Loup Baer,et al. Reducing memory latency via non-blocking and prefetching caches , 1992, ASPLOS V.
[37] John Paul Shen,et al. Instruction path coprocessors , 2000, ISCA '00.
[38] David H. Albonesi,et al. Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[39] Kunle Olukotun,et al. A Single-Chip Multiprocessor , 1997, Computer.
[40] Rajeev Balasubramonian,et al. Reducing the complexity of the register file in dynamic superscalar processors , 2001, MICRO.
[41] Yale N. Patt,et al. Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[42] Krste Asanovic,et al. Reducing power density through activity migration , 2003, ISLPED '03.
[43] T. N. Vijaykumar,et al. Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.
[44] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[45] Sang Jeong Lee,et al. Decoupled value prediction on trace processors , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[46] Gurindar S. Sohi,et al. ARB: A Hardware Mechanism for Dynamic Reordering of Memory References , 1996, IEEE Trans. Computers.
[47] Norman P. Jouppi,et al. The multicluster architecture: reducing cycle time through partitioning , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[48] Margaret Martonosi,et al. Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[49] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[50] Andrew R. Pleszkun,et al. Implementation of precise interrupts in pipelined processors , 1985, ISCA '98.
[51] Glenn Reinman,et al. A scalable front-end architecture for fast instruction delivery , 1999, ISCA.
[52] Brad Calder,et al. Automatically characterizing large scale program behavior , 2002, ASPLOS X.
[53] James E. Smith,et al. Modeling program predictability , 1998, ISCA.
[54] Norman P. Jouppi,et al. Reconfigurable caches and their application to media processing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[55] Trevor Mudge. Power: A First Class Design Constraint for Future Architecture and Automation , 2000, HiPC.
[56] Norman P. Jouppi,et al. Cacti 3.0: an integrated cache timing, power, and area model , 2001 .
[57] André Seznec,et al. CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors , 2004, J. Instr. Level Parallelism.
[58] Brad Calder,et al. Time Varying Behavior of Programs , 1999 .
[59] Kevin Skadron,et al. HotLeakage: A Temperature-Aware Model of Subthreshold and Gate Leakage for Architects , 2003 .
[60] Gurindar S. Sohi,et al. A static power model for architects , 2000, MICRO 33.
[61] Kaustav Banerjee,et al. Analysis of non-uniform temperature-dependent interconnect performance in high performance ICs , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[62] W. Robert Daasch,et al. A thermal-aware superscalar microprocessor , 2002, Proceedings International Symposium on Quality Electronic Design.
[63] Yale N. Patt,et al. A comprehensive instruction fetch mechanism for a processor supporting speculative execution , 1992, MICRO 25.
[64] David Kroft,et al. Lockup-free instruction fetch/prefetch cache organization , 1998, ISCA '81.
[65] Norman P. Jouppi,et al. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction , 2003, MICRO.
[66] P. Chow,et al. Memory-system Design Considerations For Dynamically-scheduled Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[67] Diana Marculescu,et al. Power aware microarchitecture resource scaling , 2001, Proceedings Design, Automation and Test in Europe. Conference and Exhibition 2001.
[68] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[69] Glenn Reinman,et al. Tornado warning: the perils of selective replay in multithreaded processors , 2005, ICS '05.
[70] Maurice V. Wilkes,et al. The memory wall and the CMOS end-point , 1995, CARN.
[71] Stephen H. Gunther,et al. Managing the Impact of Increasing Microprocessor Power Consumption , 2001 .
[72] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[73] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[74] Dean M. Tullsen,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000 .
[75] Rajeev Balasubramonian,et al. Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures , 2000, MICRO 33.
[76] Francisco Tirado,et al. A power perspective of value speculation for superscalar microprocessors , 2000, Proceedings 2000 International Conference on Computer Design.
[77] James E. Smith,et al. Concurrent garbage collection using hardware-assisted profiling , 2000, ISMM '00.
[78] Martin Burtscher,et al. Prediction Outcome History-Based Confidence Estimation for Load Value Prediction , 1999, J. Instr. Level Parallelism.
[79] James E. Smith,et al. Prefetching in supercomputer instruction caches , 1992, Proceedings Supercomputing '92.
[80] Joel S. Emer,et al. Loose loops sink chips , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[81] John Paul Shen,et al. Efficient and Accurate Value Prediction Using Dynamic Classification , 1998 .
[82] Douglas J. Joseph,et al. Prefetching Using Markov Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[83] Kevin Skadron,et al. Performance, energy, and thermal considerations for SMT and CMP architectures , 2005, 11th International Symposium on High-Performance Computer Architecture.
[84] Margaret Martonosi,et al. Dynamic thermal management for high-performance microprocessors , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[85] Michael C. Huang,et al. Positional adaptation of processors: application to energy reduction , 2003, ISCA '03.
[86] A.S. Dhodapkar,et al. Dynamic microarchitecture adaptation via co-designed virtual machines , 2002, 2002 IEEE International Solid-State Circuits Conference. Digest of Technical Papers (Cat. No.02CH37315).
[87] Manish Gupta,et al. Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors , 2000, IEEE Micro.
[88] Pascal Sainrat,et al. Multiple-block ahead branch predictors , 1996, ASPLOS VII.
[89] Wen-mei W. Hwu,et al. Vacuum packing: extracting hardware-detected program phases for post-link optimization , 2002, MICRO.
[90] Lian-Tuu Yeh,et al. Thermal Management of Microelectronic Equipment , 2002 .
[91] Brad Calder,et al. Predictor-directed stream buffers , 2000, MICRO 33.
[92] Lizy Kurian John,et al. Latency and energy aware value prediction for high-frequency processors , 2002, ICS '02.
[93] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996, ASPLOS VII.
[94] F. Gabbay. Speculative Execution based on Value Prediction Research Proposal towards the Degree of Doctor of Sciences , 1996 .
[95] Eric Sprangle,et al. Increasing processor performance by implementing deeper pipelines , 2002, ISCA.
[96] Andreas Moshovos,et al. Dependence based prefetching for linked data structures , 1998, ASPLOS VIII.
[97] Mikko H. Lipasti,et al. Exceeding the dataflow limit via value prediction , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[98] M. Bohr. Interconnect scaling-the real limiter to high performance ULSI , 1995, Proceedings of International Electron Devices Meeting.