System Support for Implicitly Parallel Programming
[1] Rajiv Gupta,et al. Complete removal of redundant expressions , 1998, PLDI 1998.
[2] Eric Rotenberg,et al. Transparent control independence (TCI) , 2007, ISCA '07.
[3] Matthew I. Frank,et al. A Software Framework for Supporting General Purpose Applications on Raw Computation Fabrics , 2001 .
[4] Richard Johnson,et al. The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003.
[5] Michael Hind,et al. Loop distribution with multiple exits , 1992, Proceedings Supercomputing '92.
[6] Milo M. K. Martin,et al. SafetyNet: improving the availability of shared memory multiprocessors with global checkpoint/recovery , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[7] Alexandru Nicolau,et al. Run-Time Disambiguation: Coping with Statically Unpredictable Dependencies , 1989, IEEE Trans. Computers.
[8] P. Feautrier. Array expansion , 1988 .
[9] Larry Rudolph,et al. The START-VOYAGER parallel system , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques.
[10] Haitham Akkary,et al. Checkpoint processing and recovery: towards scalable large instruction window processors , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[11] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[12] Todd C. Mowry,et al. The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[13] Kevin O'Brien,et al. Single-program speculative multithreading (SPSM) architecture: compiler-assisted fine-grained multithreading , 1995, PACT.
[14] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[15] Kunle Olukotun,et al. Data speculation support for a chip multiprocessor , 1998, ASPLOS VIII.
[16] Scott A. Mahlke,et al. Dynamic memory disambiguation using the memory conflict buffer , 1994, ASPLOS VI.
[17] Scott A. Mahlke,et al. Integrated predicated and speculative execution in the IMPACT EPIC architecture , 1998, ISCA.
[18] Sam S. Stone,et al. Address-indexed memory disambiguation and store-to-load forwarding , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[19] Haitham Akkary,et al. A dynamic multithreading processor , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[20] Eric Rotenberg,et al. Control independence in trace processors , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[21] Andrew W. Appel,et al. SSA is functional programming , 1998, SIGP.
[22] Chen Yang,et al. A cost-driven compilation framework for speculative parallelization of sequential programs , 2004, PLDI '04.
[23] Scott A. Mahlke,et al. The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.
[24] Mendel Rosenblum,et al. Stream programming on general-purpose processors , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[25] Anant Agarwal,et al. SUDS: Primitive Mechanisms for Memory Dependence Speculation , 1999 .
[26] David A. Padua,et al. Dependence graphs and compiler optimizations , 1981, POPL '81.
[27] Michael Gschwind,et al. Dynamic Binary Translation and Optimization , 2001, IEEE Trans. Computers.
[28] Jung Ho Ahn,et al. Merrimac: Supercomputing with Streams , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[29] Guy E. Blelloch,et al. A comparison of sorting algorithms for the connection machine CM-2 , 1991, SPAA '91.
[30] Sanjay J. Patel,et al. Performance characterization of a hardware mechanism for dynamic optimization , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[31] Chih,et al. Compiling Sequential Programs for Speculative Parallelism , 1993 .
[32] Anant Agarwal,et al. Constructing virtual architectures on a tiled processor , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[33] Kenneth C. Yeager. The Mips R10000 superscalar microprocessor , 1996, IEEE Micro.
[34] David J. Lilja,et al. Coarse-grained speculative execution in shared-memory multiprocessors , 1998, ICS '98.
[35] David A. Padua,et al. Automatic Array Privatization , 1993, Compiler Optimizations for Scalable Parallel Systems Languages.
[36] Robert H. Halstead,et al. Lazy task creation: a technique for increasing the granularity of parallel programs , 1990, LISP and Functional Programming.
[37] Dean M. Tullsen,et al. Control Flow Optimization Via Dynamic Reconvergence Prediction , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[38] Josep Torrellas,et al. Hardware and software support for speculative execution of sequential binaries on a chip-multiprocessor , 1998, ICS '98.
[39] Richard A. Kelsey. A correspondence between continuation passing style and static single assignment form , 1995 .
[40] Guy E. Blelloch,et al. Scan primitives for vector computers , 1990, Proceedings SUPERCOMPUTING '90.
[41] Satoshi Matsushita,et al. Pinot: speculative multi-threading processor architecture exploiting parallelism over a wide range of granularities , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[42] Satoshi Matsuoka,et al. Highly efficient and encapsulated re-use of synchronization code in concurrent object-oriented languages , 1993, OOPSLA '93.
[43] Bradley C. Kuszmaul,et al. Unbounded Transactional Memory , 2005, HPCA.
[44] Ken Kennedy,et al. Loop distribution with arbitrary control flow , 1990, Proceedings SUPERCOMPUTING '90.
[45] Pete Tinker,et al. Parallel execution of sequential scheme with ParaTran , 1988, LISP and Functional Programming.
[46] K. Ebcioglu,et al. Daisy: Dynamic Compilation For 100% Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[47] Sanjay J. Patel,et al. Implicitly Parallel Programming Models for Thousand-Core Microprocessors , 2007, 2007 44th ACM/IEEE Design Automation Conference.
[48] Chen-Yong Cher,et al. Skipper: a microarchitecture for exploiting control-flow independence , 2001, MICRO.
[49] Andrea C. Arpaci-Dusseau,et al. Fast Parallel Sorting Under LogP: Experience with the CM-5 , 1996, IEEE Trans. Parallel Distributed Syst..
[50] William Thies,et al. Linear analysis and optimization of stream programs , 2003, PLDI '03.
[51] Todd C. Mowry,et al. Tolerating Dependences Between Large Speculative Threads Via Sub-Threads , 2006, ISCA 2006.
[52] Zhiyuan Li. Array privatization for parallel execution of loops , 1992, ICS.
[53] T. N. Vijaykumar,et al. Implicitly-multithreaded processors , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[54] Eddie Kohler,et al. Programming language optimizations for modular router configurations , 2002, ASPLOS X.
[55] S. S. Stone. Multiversioning in the Store Queue Is the Root of All Store-forwarding Evil , 2022 .
[56] Henry Hoffmann,et al. Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[57] Markus Mock,et al. Calpa: a tool for automating selective dynamic compilation , 2000, MICRO 33.
[58] Wen-mei W. Hwu,et al. Field-testing IMPACT EPIC research results in Itanium 2 , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[59] Robert H. Halstead,et al. MULTILISP: a language for concurrent symbolic computation , 1985, TOPL.
[60] Brian N. Bershad,et al. Fast, effective dynamic compilation , 1996, PLDI '96.
[61] Bjarne Steensgaard. Sparse functional stores for imperative programs , 1995 .
[62] Burton J. Smith. Architecture And Applications Of The HEP Multiprocessor Computer System , 1982, Optics & Photonics.
[63] Janak H. Patel,et al. Error Recovery in Shared Memory Multiprocessors Using Private Caches , 1990, IEEE Trans. Parallel Distributed Syst..
[64] Milind Girkar,et al. On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings , 2006, ICS '06.
[65] Ron Cytron,et al. What's In a Name? -or- The Value of Renaming for Parallelism Detection and Storage Allocation , 1987, ICPP.
[66] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[67] Erik R. Altman,et al. Daisy: Dynamic Compilation For 100% Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[68] David A. Wood,et al. LogTM: log-based transactional memory , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..
[69] Monica S. Lam,et al. Array-data flow analysis and its use in array privatization , 1993, POPL '93.
[70] Guy E. Blelloch,et al. Provably efficient scheduling for languages with fine-grained parallelism , 1999, JACM.
[71] Yen-Kuang Chen,et al. The ALPBench benchmark suite for complex multimedia applications , 2005, Proceedings of the IEEE International Workload Characterization Symposium, 2005.
[72] Antonio González,et al. Speculative multithreaded processors , 1998, ICS '98.
[73] L. Rauchwerger,et al. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization , 1999, IEEE Trans. Parallel Distributed Syst..
[74] Guy L. Steele. Debunking the “expensive procedure call” myth or, procedure call implementations considered harmful or, LAMBDA: The Ultimate GOTO , 1977, ACM '77.
[75] Josep Torrellas,et al. ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors , 2002, ISCA.
[76] Josep Torrellas,et al. Removing architectural bottlenecks to the scalability of speculative parallelization , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[77] Markus Mock,et al. Dynamic points-to sets: a comparison with static analyses and potential applications in program understanding and optimization , 2001, PASTE '01.
[78] Markus Mock,et al. DyC: an expressive annotation-directed dynamic compiler for C , 2000, Theor. Comput. Sci..
[79] Gurindar S. Sohi,et al. The Expandable Split Window Paradigm for Exploiting Fine-grain Parallelism , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[81] Gurindar S. Sohi,et al. Master/Slave Speculative Parallelization , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[82] Babak Falsafi,et al. Implicitly-multithreaded processors , 2003, ISCA '03.
[83] Jenn-Yuan Tsai,et al. The superthreaded architecture: thread pipelining with run-time data dependence checking and control speculation , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique.
[84] Amir Roth,et al. Ginger: control independence using tag rewriting , 2007, ISCA '07.
[85] John V. Guttag,et al. Design and implementation of software radios using a general purpose processor , 1999 .
[86] Matthew I. Frank,et al. SUDS: automatic parallelization for raw processors , 2003 .
[87] Wen-mei W. Hwu,et al. Automatic Discovery of Coarse-Grained Parallelism in Media Applications , 2007, Trans. High Perform. Embed. Archit. Compil..
[88] Wei Liu,et al. Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation , 2005, ICS '05.
[89] Burton H. Bloom,et al. Space/time trade-offs in hash coding with allowable errors , 1970, CACM.
[90] Mayank Agarwal,et al. Exploiting Postdominance for Speculative Parallelization , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[91] James R. Larus,et al. EEL: machine-independent executable editing , 1995, PLDI '95.
[92] J.F. Martinez,et al. Cherry: Checkpointed early resource recycling in out-of-order microprocessors , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[93] Sanjay J. Patel,et al. rePLay: A Hardware Framework for Dynamic Optimization , 2001, IEEE Trans. Computers.
[94] Thomas F. Knight. An architecture for mostly functional languages , 1986, LFP '86.
[95] Maurice Herlihy,et al. Virtualizing transactional memory , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[96] Robert D. Blumofe,et al. Scheduling multithreaded computations by work stealing , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[97] William J. Dally,et al. The J-machine Multicomputer: An Architectural Evaluation , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[98] Yale N. Patt,et al. Checkpoint repair for out-of-order execution machines , 1987, ISCA '87.
[99] Andreas Moshovos,et al. Dynamic Speculation and Synchronization of Data Dependences , 1997, ISCA.
[100] Josep Torrellas,et al. Bulk Disambiguation of Speculative Threads in Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[101] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[102] Craig B. Zilles,et al. Hardware atomicity for reliable software speculation , 2007, ISCA '07.
[103] Guy E. Blelloch,et al. Solving linear recurrences with loop raking , 1992, Proceedings Sixth International Parallel Processing Symposium.