Analytical performance analysis and modeling of superscalar and multi-threaded processors
暂无分享,去创建一个
[1] Balaram Sinharoy,et al. POWER5 system microarchitecture , 2005, IBM J. Res. Dev..
[2] R. Govindarajan,et al. Evaluating register allocation and instruction scheduling techniques in out-of-order issue processors , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[3] Brinkley Sprunt,et al. Pentium 4 Performance-Monitoring Features , 2002, IEEE Micro.
[4] Brad Calder,et al. Picking statistically valid and early simulation points , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[5] David M. Brooks,et al. Accurate and efficient regression modeling for microarchitectural performance and power prediction , 2006, ASPLOS XII.
[6] James E. Smith,et al. The microarchitecture of superscalar processors , 1995, Proc. IEEE.
[7] Dean M. Tullsen,et al. Symbiotic jobscheduling for a simultaneous mutlithreading processor , 2000, SIGP.
[8] T. Puzak,et al. The optimum pipeline depth for a microprocessor , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[9] Andrew F. Glew. MLP yes! ILP no , 1998, ASPLOS 1998.
[10] Steven K. Reinhardt,et al. The impact of resource partitioning on SMT processors , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[11] Kapil Vaswani,et al. A Predictive Performance Model for Superscalar Processors , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[12] Thomas R. Puzak,et al. Optimum power/performance pipeline depth , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[13] Roland E. Wunderlich,et al. SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[14] Kapil Vaswani,et al. Construction and use of linear regression models for processor performance analysis , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..
[15] Greg Hamerly,et al. SimPoint 3.0: Faster and More Flexible Program Analysis , 2005 .
[16] Tejas Karkhanis,et al. A Day in the Life of a Data Cache Miss , 2002 .
[17] James E. Smith,et al. Optimal Pipelining in Supercomputers , 1986, ISCA.
[18] Eric Sprangle,et al. Increasing processor performance by implementing deeper pipelines , 2002, ISCA.
[19] Sarita V. Adve,et al. Performance of database workloads on shared-memory systems with out-of-order processors , 1998, ASPLOS VIII.
[20] Yan Solihin,et al. Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[21] Gregory F. Grohoski,et al. Machine Organization of the IBM RISC System/6000 Processor , 1990, IBM J. Res. Dev..
[22] William Stallings,et al. Operating Systems: Internals and Design Principles , 1991 .
[23] Dean M. Tullsen,et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[24] Stéphan Jourdan,et al. Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[25] James E. Smith,et al. A first-order superscalar processor model , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[26] John Paul Shen,et al. A framework for statistical modeling of superscalar processor performance , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.
[27] Dean M. Tullsen,et al. Symbiotic jobscheduling with priorities for a simultaneous multithreading processor , 2002, SIGMETRICS '02.
[28] Avi Mendelson,et al. Fairness enforcement in switch on event multithreading , 2007, TACO.
[29] A. J. KleinOsowski,et al. MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research , 2002, IEEE Computer Architecture Letters.
[30] Lieven Eeckhout,et al. Memory Data Flow Modeling in Statistical Simulation for the Efficient Exploration of Microprocessor Design Spaces , 2008, IEEE Transactions on Computers.
[31] Dean M. Tullsen,et al. Fellowship - Simulation And Modeling Of A Simultaneous Multithreading Processor , 1996, Int. CMG Conference.
[32] James E. Smith,et al. Modeling superscalar processors via statistical simulation , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[33] Stéphan Jourdan,et al. An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors , 2004, International Journal of Parallel Programming.
[34] Rohit Jain,et al. Soft real-time scheduling on simultaneous multithreaded processors , 2002, 23rd IEEE Real-Time Systems Symposium, 2002. RTSS 2002..
[35] James E. Smith,et al. Fair Queuing Memory Systems , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[36] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[37] Edward M. Riseman,et al. The Inhibition of Potential Parallelism by Conditional Jumps , 1972, IEEE Transactions on Computers.
[38] John L. Henning. SPEC CPU2000: Measuring CPU Performance in the New Millennium , 2000, Computer.
[39] James E. Smith,et al. Virtual private caches , 2007, ISCA '07.
[40] Yan Solihin,et al. QoS policies and architecture for cache/memory in CMP platforms , 2007, SIGMETRICS '07.
[41] David B. Papworth. Tuning the Pentium Pro microarchitecture , 1996, IEEE Micro.
[42] Sally A. McKee,et al. Efficiently exploring architectural design spaces via predictive modeling , 2006, ASPLOS XII.
[43] Douglas M. Hawkins,et al. Characterizing and comparing prevailing simulation techniques , 2005, 11th International Symposium on High-Performance Computer Architecture.
[44] S. Turner,et al. Performance Analysis Using the MIPS R10000 Performance Counters , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[45] Onur Mutlu,et al. Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[46] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[47] Alex Mericas. Performance Monitoring on the POWER5™ Microprocessor , 2005 .
[48] Santosh G. Abraham,et al. Efficient simulation of caches under optimal replacement with applications to miss characterization , 1993, SIGMETRICS '93.
[49] Ravi R. Iyer,et al. CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.
[50] Frederic T. Chong,et al. HLS: combining statistical and symbolic simulation to guide microprocessor designs , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[51] Tarek M. Taha,et al. An Instruction Throughput Model of Superscalar Processors , 2008, IEEE Trans. Computers.
[52] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[53] Manoj Franklin,et al. Balancing thoughput and fairness in SMT processors , 2001, 2001 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS..
[54] Pradip Bose,et al. Optimizing pipelines for power and performance , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[55] Eric M. Schwarz,et al. IBM POWER6 microarchitecture , 2007, IBM J. Res. Dev..
[56] D. Patterson,et al. Performance characterization of a quad Pentium Pro SMP using OLTP workloads , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[57] Lizy Kurian John,et al. Benchmarking Internet Servers on Superscalar Machines , 2001 .
[58] James E. Smith,et al. Automated design of application specific superscalar processors: an analytical approach , 2007, ISCA '07.
[59] Dean M. Tullsen,et al. Handling long-latency loads in a simultaneous multithreading processor , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[60] Trevor Mudge,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001 .
[61] Simcha Gochman,et al. Introduction to Intel Core Duo Processor Architecture , 2006 .