Holistic design for multi-core architectures
[1] André Seznec,et al. CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors , 2004, J. Instr. Level Parallelism.
[2] Norman P. Jouppi,et al. A Multi-Core Approach to Addressing the Energy-Complexity Problem in Microprocessors , 2003 .
[3] Mikko H. Lipasti,et al. A performance methodology for commercial servers , 2000, IBM J. Res. Dev..
[4] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[5] Norman P. Jouppi,et al. Processor Power Reduction Via Single-ISA Heterogeneous Multi-Core Architectures , 2003, IEEE Computer Architecture Letters.
[6] Charles L. Seitz,et al. The cosmic cube , 1985, CACM.
[7] Margaret Martonosi,et al. Run-time power estimation in high performance microprocessors , 2001, ISLPED '01.
[8] Yu Bai,et al. Dynamically Reconfiguring Processor Resources to Reduce Power Consumption in High-Performance Processors , 2000, PACS.
[9] Daniele Folegnani,et al. Reducing Power Consumption of the Issue Logic , 2000 .
[10] Mark Horowitz,et al. Energy dissipation in general purpose microprocessors , 1996, IEEE J. Solid State Circuits.
[11] Dirk Grunwald,et al. Using IPC Variation in Workloads with Externally Specified Rates to Reduce Power Consumption , 2000 .
[12] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[13] Norman P. Jouppi,et al. Conjoined-Core Chip Multiprocessing , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[14] R.H. Dennard,et al. Design Of Ion-implanted MOSFET's with Very Small Physical Dimensions , 1974, Proceedings of the IEEE.
[15] Diana Marculescu,et al. Power aware microarchitecture resource scaling , 2001, Proceedings Design, Automation and Test in Europe. Conference and Exhibition 2001.
[16] Axel Jantsch,et al. Network on Chip : An architecture for billion transistor era , 2000 .
[17] Jaehyuk Huh,et al. Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture , 2003, IEEE Micro.
[18] John Edward Cronin,et al. Submicron wiring technology with tungsten and planarization , 1987 .
[19] Uri C. Weiser,et al. Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors , 2006, IEEE Computer Architecture Letters.
[20] Burton M. Leary,et al. A 200 MHz 64 b dual-issue CMOS microprocessor , 1992, 1992 IEEE International Solid-State Circuits Conference Digest of Technical Papers.
[21] Brad Calder,et al. Discovering and Exploiting Program Phases , 2003, IEEE Micro.
[22] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[23] W. Dally,et al. Route packets, not wires: on-chip interconnection networks , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[24] Kunle Olukotun,et al. Maximizing CMP throughput with mediocre cores , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[25] Dirk Grunwald,et al. Confidence estimation for speculation control , 1998, ISCA.
[26] Kunle Olukotun,et al. Data speculation support for a chip multiprocessor , 1998, ASPLOS VIII.
[27] John Paul Shen,et al. Mitigating Amdahl's law through EPI throttling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[28] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[29] Norman P. Jouppi,et al. Computer technology and architecture: an evolving interaction , 1991, Computer.
[30] Josep Torrellas,et al. A clustered approach to multithreaded processors , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.
[31] Matthew Mattina,et al. Tarantula: a vector extension to the alpha architecture , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[32] Uri C. Weiser,et al. ACCMP-asymmetric cluster chip-multiprocessing , 2004 .
[33] James K. Archibald,et al. Cache coherence protocols: evaluation using a multiprocessor simulation model , 1986, TOCS.
[34] Thomas N. Theis,et al. The future of interconnection technology , 2000, IBM J. Res. Dev..
[35] Stephen H. Gunther,et al. Managing the Impact of Increasing Microprocessor Power Consumption , 2001 .
[36] Ashok Kumar,et al. The HP PA-8000 RISC CPU , 1997, IEEE Micro.
[37] S. J. Frank,et al. Tightly coupled multiprocessor system speeds memory-access times , 1984 .
[38] J. Petrovick,et al. The circuit and physical design of the POWER4 microprocessor , 2002, IBM J. Res. Dev..
[39] H. Peter Hofstee,et al. Introduction to the Cell multiprocessor , 2005, IBM J. Res. Dev..
[40] Jaehyuk Huh,et al. Exploring the design space of future CMPs , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[41] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.
[42] Mark S. Squillante,et al. Evaluation of Multithreaded Uniprocessors for Commercial Application Environments , 1996, ISCA.
[43] Diana Marculescu,et al. Power and performance evaluation of globally asynchronous locally synchronous processors , 2002, ISCA.
[44] John Paul Shen,et al. Best of both latency and throughput , 2004, IEEE International Conference on Computer Design: VLSI in Computers and Processors, 2004. ICCD 2004. Proceedings..
[45] Dean M. Tullsen,et al. Clustered multithreaded architectures - pursuing both IPC and cycle time , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[46] Dean M. Tullsen,et al. Handling long-latency loads in a simultaneous multithreading processor , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[47] Dean M. Tullsen,et al. Fellowship - Simulation And Modeling Of A Simultaneous Multithreading Processor , 1996, Int. CMG Conference.
[48] Balaram Sinharoy,et al. Design and implementation of the POWER5 microprocessor , 2004, Proceedings. 41st Design Automation Conference, 2004..
[49] Dirk Grunwald,et al. Aide de Camp: Asymmetric Dual Core Design for Power and Energy Reduction ; CU-CS-964-03 , 2003 .
[50] Michael J. Flynn,et al. An area model for on-chip memories and its application , 1991 .
[51] Ravi Rajwar,et al. The impact of performance asymmetry in emerging multicore architectures , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[52] Margaret Martonosi,et al. Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[53] Soonhoi Ha,et al. A Static Scheduling Heuristic for Heterogeneous Processors , 1996, Euro-Par, Vol. II.
[54] Li-Shiuan Peh,et al. Flow control and micro-architectural mechanisms for extending the performance of interconnection networks , 2001 .
[55] Antonio González,et al. Clustered speculative multithreaded processors , 1999, ICS '99.
[56] Janak H. Patel,et al. A low-overhead coherence solution for multiprocessors with private cache memories , 1984, ISCA '84.
[57] Kevin P. McAuliffe,et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture , 1985, ICPP.
[58] Brad Calder,et al. Automatically characterizing large scale program behavior , 2002, ASPLOS X.
[59] Donald Yeung,et al. Sparcle: an evolutionary processor design for large-scale multiprocessors , 1993, IEEE Micro.
[60] Kevin Knight,et al. Artificial intelligence (2. ed.) , 1991 .
[61] James Laudon,et al. Performance/Watt: the new server focus , 2005, CARN.
[62] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[63] Soraya Ghiasi,et al. Scheduling for heterogeneous processors in server systems , 2005, CF '05.
[64] Dean M. Tullsen,et al. Interconnections in multi-core architectures: understanding mechanisms, overheads and scaling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[65] Hal Wasserman,et al. Comparing algorithms for dynamic speed-setting of a low-power CPU , 1995, MobiCom '95.
[66] David H. Albonesi,et al. Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[67] R. Kotla,et al. Characterizing the impact of different memory-intensity levels , 2004, IEEE International Workshop on Workload Characterization, 2004. WWC-7. 2004.
[68] Kunle Olukotun,et al. A Single-Chip Multiprocessor , 1997, Computer.
[69] Michael L. Scott,et al. Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[70] Norman P. Jouppi,et al. Cacti 3.0: an integrated cache timing, power, and area model , 2001 .
[71] Dirk Grunwald,et al. Pipeline gating: speculation control for energy reduction , 1998, ISCA.
[72] K. Steinhubl. Design of Ion-Implanted MOSFET's with Very Small Physical Dimensions , 1974 .
[73] Jian Li,et al. Power-Performance Implications of Thread-level Parallelism on Chip Multiprocessors , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005..
[74] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[75] Luiz André Barroso,et al. Piranha: a scalable architecture based on single-chip multiprocessing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[76] R. Hokinson,et al. Implementation of an Alpha microprocessor in SOI , 2003, 2003 IEEE International Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC..
[77] John Paul Shen,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[78] Sujit Dey,et al. On-chip communication architecture for OC-768 network processors , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[79] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[80] Mark Horowitz,et al. Scaling, Power and the Future of CMOS , 2007, 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems (VLSID'07).
[81] Artur W. Klauser,et al. Trends in high-performance microprocessor design , 2001 .
[82] Shashank Gupta,et al. Technology Independent Area and Delay Estimations for Microprocessor Building Blocks , 2001 .
[83] Shreekant S. Thakkar,et al. The Symmetry Multiprocessor System , 1988, ICPP.
[84] Yves Robert,et al. The Master-Slave Paradigm with Heterogeneous Processors , 2001, CLUSTER.
[85] Norman P. Jouppi,et al. Core architecture optimization for heterogeneous chip multiprocessors , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[86] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[87] Jean-Luc Gaudiot,et al. Area and system clock effects on SMT/CMP processors , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[88] Thomas D. Burd,et al. The simulation and evaluation of dynamic voltage scaling algorithms , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[89] Jean-Luc Gaudiot,et al. SMT Layout Overhead and Scalability , 2002, IEEE Trans. Parallel Distributed Syst..
[90] Daniel Gajski,et al. CEDAR: a large scale multiprocessor , 1983, CARN.
[91] T.H. Lee,et al. A 600 MHz superscalar RISC microprocessor with out-of-order execution , 1997, 1997 IEEE International Solids-State Circuits Conference. Digest of Technical Papers.
[92] Andrew W. Wilson,et al. Hierarchical cache/bus architecture for shared memory multiprocessors , 1987, ISCA '87.
[93] Michel Dubois,et al. Synchronization, coherence, and event ordering in multiprocessors , 1988, Computer.
[94] Keith Diefendorff. Compaq chooses smt for alpha: simultaneous multithreading exploits instruction- and thread-level par , 1999 .
[95] Brad Calder,et al. Time Varying Behavior of Programs , 1999 .
[96] Craig Zilles,et al. Execution-based prediction using speculative slices , 2001, ISCA 2001.
[97] Ken Mai,et al. The future of wires , 2001, Proc. IEEE.
[98] Richard E. Kessler,et al. The Alpha 21264 microprocessor architecture , 1998, Proceedings International Conference on Computer Design. VLSI in Computers and Processors (Cat. No.98CB36273).