Hybrid multi-core architecture for boosting single-threaded performance
[1] Dan Boneh,et al. Architectural support for copy and tamper resistant software , 2000, SIGP.
[2] Sumedh W. Sathaye,et al. Properties of Rescheduling Size Invariance for Dynamic Rescheduling-Based VLIW Cross-Generation Compatibility , 2000, IEEE Trans. Computers.
[3] Christopher Hughes,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, ISCA 2001.
[4] Scott A. Mahlke,et al. The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.
[5] Steven S. Muchnick,et al. Advanced Compiler Design and Implementation , 1997 .
[6] Rudolf Eigenmann,et al. Min-cut program decomposition for thread-level speculation , 2004, PLDI '04.
[7] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[8] Yonghong Song,et al. Design and implementation of a compiler framework for helper threading on multi-core processors , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[9] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[10] Guilherme Ottoni,et al. Automatic thread extraction with decoupled software pipelining , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[11] Monica S. Lam,et al. In search of speculative thread-level parallelism , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[12] Konrad K. Lai,et al. The Impact of Performance Asymmetry in Emerging Multicore Architectures , 2005, ISCA 2005.
[13] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[14] John Paul Shen,et al. Post-pass binary adaptation for software-based speculative precomputation , 2002, PLDI '02.
[15] Ravi Rajwar,et al. The impact of performance asymmetry in emerging multicore architectures , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[16] Scott A. Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 25.
[17] John Paul Shen,et al. Speculative Precomputation on Chip Multiprocessors , 2002 .
[18] Quinn Jacobson,et al. Trace processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[19] Trevor N. Mudge,et al. ChipLock: support for secure microarchitectures , 2005, CARN.
[20] B. Ramakrishna Rau,et al. Dynamically scheduled VLIW processors , 1993, Proceedings of the 26th Annual International Symposium on Microarchitecture.
[21] H. Peter Hofstee,et al. Introduction to the Cell multiprocessor , 2005, IBM J. Res. Dev.
[22] Erik R. Altman,et al. BOA: The Architecture of a Binary Translation Processor , 1999 .
[23] B. Ramakrishna Rau,et al. EPIC: An Architecture for Instruction-Level Parallel Processors , 2000 .
[24] Sun Microsystems,et al. Jini Architecture Specification Version 2.0 , 2003 .
[25] David J. Lilja,et al. Data prefetch mechanisms , 2000, CSUR.
[26] Gurindar S. Sohi,et al. Multiscalar processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[27] Utpal Banerjee. Loop Parallelization , 1994, Springer US.
[28] Todd C. Mowry,et al. The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[29] Scott A. Mahlke,et al. IMPACT: An Architectural Framework for Multiple-Instruction-Issue Processors , 1998, 25 Years ISCA: Retrospectives and Reprints.
[30] David I. August,et al. Chip multi-processor scalability for single-threaded applications , 2005, CARN.
[31] James R. Larus,et al. Branch prediction for free , 1993, PLDI '93.
[32] B. R. Rau,et al. HPL-PD Architecture Specification: Version 1.1 , 2000 .
[33] Paolo Faraboschi,et al. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools , 2004 .
[34] Kunle Olukotun,et al. The case for a single-chip multiprocessor , 1996, ASPLOS VII.
[35] Joseph A. Fisher,et al. Very long instruction word architectures and the ELI-512 , 1983, ISCA '83.
[36] Scott Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 1992.
[37] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[38] Karthikeyan Sankaralingam,et al. A design space evaluation of grid processor architectures , 2001, MICRO.
[39] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[40] S. Sudharsanan,et al. Image and video processing using MAJC 5200 , 2000, Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101).
[41] Kunle Olukotun,et al. Using thread-level speculation to simplify manual parallelization , 2003, PPoPP '03.
[42] Guang R. Gao,et al. Design and Implementation of an Efficient Thread Partitioning Algorithm , 2000, ISHPC.
[43] Jian Huang,et al. The Superthreaded Processor Architecture , 1999, IEEE Trans. Computers.