Performance scalability of decoupled software pipelining
[1] Kunle Olukotun, et al. The Stanford Hydra CMP, 2000, IEEE Micro.
[2] Christopher Hughes, et al. Speculative precomputation: long-range prefetching of delinquent loads, 2001, ISCA 2001.
[3] William Thies, et al. StreamIt: A Language for Streaming Applications, 2002, CC.
[4] Gurindar S. Sohi, et al. Speculative data-driven multithreading, 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[5] David Alejandro Padua Haiek. Multiprocessors: discussion of some theoretical and practical problems, 1980.
[6] Yale N. Patt, et al. Simultaneous subordinate microthreading (SSMT), 1999, ISCA.
[7] Gurindar S. Sohi, et al. Multiscalar processors, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[8] Guilherme Ottoni, et al. Automatic thread extraction with decoupled software pipelining, 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[9] Thomas F. Wenisch, et al. TurboSMARTS: accurate microarchitecture simulation sampling in minutes, 2005, SIGMETRICS '05.
[10] Yale N. Patt, et al. Simultaneous subordinate microthreading, 2004.
[11] David I. August, et al. Microarchitectural exploration with Liberty, 2002, MICRO 35.
[12] Huiyang Zhou, et al. Dual-core execution: building a highly scalable single-thread instruction window, 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[13] Jean-Luc Gaudiot, et al. Design and evaluation of a hierarchical decoupled architecture, 2006, The Journal of Supercomputing.
[14] Thomas F. Wenisch, et al. SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling, 2003, ISCA '03.
[15] Lizy Kurian John, et al. Efficiently Evaluating Speedup Using Sampled Processor Simulation, 2004, IEEE Computer Architecture Letters.
[16] David I. August, et al. Decoupled software pipelining with the synchronization array, 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[17] Easwaran Raman, et al. A framework for unrestricted whole-program optimization, 2006, PLDI '06.
[18] Antonia Zhai, et al. The STAMPede approach to thread-level speculation, 2005, TOCS.
[19] Wen-mei W. Hwu, et al. "Flea-flicker" multipass pipelining: an alternative to the high-power out-of-order offense, 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[20] David I. August, et al. Rapid Development of a Flexible Validated Processor Model, 2004.
[21] Sanjay J. Patel, et al. Beating in-order stalls with "flea-flicker" two-pass pipelining, 2006, IEEE Transactions on Computers.
[22] G. H. Barnes, et al. A controllable MIMD architecture, 1986.
[23] Miodrag Potkonjak, et al. MediaBench: a tool for evaluating and synthesizing multimedia and communications systems, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[24] James E. Smith, et al. Decoupled access/execute computer architectures, 1984, TOCS.
[25] Long Li, et al. Automatically partitioning packet processing applications for pipelined architectures, 2005, PLDI '05.
[26] David I. August, et al. The liberty structural specification language: a high-level modeling language for component reuse, 2004, PLDI '04.
[27] Ron Cytron, et al. Doacross: Beyond Vectorization for Multiprocessors, 1986, ICPP.
[28] Guilherme Ottoni, et al. Support for High-Frequency Streaming in CMPs, 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[29] S. Vajapeyam, et al. Improving Superscalar Instruction Dispatch And Issue By Exploiting Dynamic Code Sequences, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[30] John Paul Shen, et al. Memory latency-tolerance approaches for Itanium processors: out-of-order execution vs. speculative precomputation, 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[31] Jian Huang, et al. The Superthreaded Processor Architecture, 1999, IEEE Trans. Computers.
[32] Henry Hoffmann, et al. A stream compiler for communication-exposed architectures, 2002, ASPLOS X.