Program Demultiplexing: Data-flow based Speculative Parallelization of Methods in Sequential Programs
[1] B. Ramakrishna Rau,et al. EPIC: Explicitly Parallel Instruction Computing , 2000, Computer.
[2] Frank Tip,et al. A survey of program slicing techniques , 1994, J. Program. Lang..
[3] Uri C. Weiser,et al. MMX technology extension to the Intel architecture , 1996, IEEE Micro.
[4] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..
[5] T. W. Christopher,et al. Early experience with object-oriented message driven computing , 1990, [1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation.
[6] Hunter Scales,et al. AltiVec Extension to PowerPC Accelerates Media Processing , 2000, IEEE Micro.
[7] Andrew R. Pleszkun,et al. Implementing Precise Interrupts in Pipelined Processors , 1988, IEEE Trans. Computers.
[8] Antonio González,et al. Clustered speculative multithreaded processors , 1999, ICS '99.
[9] James R. Larus. C**: A Large-Grain, Object-Oriented, Data-Parallel Programming Language , 1992, LCPC.
[10] Cameron McNairy,et al. Itanium 2 Processor Microarchitecture , 2003 .
[11] Steve Johnson,et al. Compiling C for vectorization, parallelization, and inline expansion , 1988, PLDI '88.
[12] D. Geer,et al. Chip makers turn to multicore processors , 2005, Computer.
[13] James R. Goodman,et al. Transactional lock-free execution of lock-based programs , 2002, ASPLOS X.
[14] Jeffrey Su,et al. A dual-core 64-bit ultraSPARC microprocessor for dense server applications , 2004, IEEE Journal of Solid-State Circuits.
[15] David J. Sager,et al. The microarchitecture of the Pentium 4 processor , 2001 .
[16] Jeffrey S. Chase,et al. The Amber system: parallel programming on a network of multiprocessors , 1989, SOSP '89.
[17] Richard E. Kessler,et al. The Alpha 21264 microprocessor , 1999, IEEE Micro.
[18] V. G. Grafe,et al. The Epsilon dataflow processor , 1989, ISCA '89.
[19] Kunle Olukotun,et al. Programming with transactional coherence and consistency (TCC) , 2004, ASPLOS XI.
[20] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors , 1970 .
[21] Mark Scott Johnson. Some requirements for architectural support of software debugging , 1982, ASPLOS I.
[22] James R. McGraw,et al. The VAL Language: Description and Analysis , 1982, TOPL.
[23] Suresh Jagannathan,et al. Safe futures for Java , 2005, OOPSLA '05.
[24] Kunle Olukotun,et al. The Jrpm system for dynamically parallelizing Java programs , 2003, ISCA '03.
[25] Chen Yang,et al. A cost-driven compilation framework for speculative parallelization of sequential programs , 2004, PLDI '04.
[26] K. M. George,et al. Parallelizing translator for an object-oriented parallel programming language , 1991, [1991 Proceedings] Tenth Annual International Phoenix Conference on Computers and Communications.
[27] Bradley C. Kuszmaul,et al. Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.
[28] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[29] Gurindar S. Sohi,et al. Instruction issue logic for high-performance, interruptable pipelined processors , 1987, ISCA '87.
[30] Christopher J. Hughes,et al. Hybrid transactional memory , 2006, PPoPP '06.
[31] Pat Hanrahan,et al. Brook for GPUs: stream computing on graphics hardware , 2004, SIGGRAPH 2004.
[32] Monica S. Lam,et al. Interprocedural parallelization analysis in SUIF , 2005, TOPL.
[33] Jong-Deok Choi,et al. The Jalapeño Dynamic Optimizing Compiler for Java™ , 1999, JAVA '99.
[34] Aart J. C. Bik. Software Vectorization Handbook, The: Applying Intel Multimedia Extensions for Maximum Performance , 2004 .
[35] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev..
[36] Milind Girkar. Functional parallelism: theoretical foundations and implementation , 1992 .
[37] K. Mani Chandy,et al. Compositional C++: Compositional Parallel Programming , 1992, LCPC.
[38] Monica S. Lam,et al. In search of speculative thread-level parallelism , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[39] Vivek Sarkar,et al. Partitioning parallel programs for macro-dataflow , 1986, LFP '86.
[40] Bantwal R. Rau. Dynamically scheduled VLIW processors , 1993, MICRO 1993.
[41] Vikram S. Adve,et al. LLVA: a low-level virtual instruction set architecture , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[42] Wing Cheong Lau,et al. An Object-Oriented Class Library for Scalable Parallel Heuristic Search , 1992, ECOOP.
[43] Rohit Bhatia,et al. Montecito: a dual-core, dual-thread Itanium processor , 2005, IEEE Micro.
[44] A. A. Chien,et al. Object-oriented concurrent programming in CST , 1988, C3P.
[45] David L. Weaver,et al. The SPARC Architecture Manual , 2003 .
[46] Gul A. Agha,et al. HAL: A High-Level Actor Language and Its Distributed Implementation , 1992, ICPP.
[47] Karthikeyan Sankaralingam,et al. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture , 2003 .
[48] Milind Girkar,et al. Extracting task-level parallelism , 1995, TOPL.
[49] Maurice Herlihy,et al. Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[50] Matthew Mattina,et al. Tarantula: a vector extension to the alpha architecture , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[51] David W. Binkley,et al. Program slicing , 2008, 2008 Frontiers of Software Maintenance.
[52] P. Hudak,et al. Implementing functional programs on a hypercube multiprocessor , 1988, C3P.
[53] Arvind,et al. Executing a Program on the MIT Tagged-Token Dataflow Architecture , 1987, IEEE Trans. Computers.
[54] Robert P. Colwell,et al. Architecture and implementation of a VLIW supercomputer , 1990, Proceedings SUPERCOMPUTING '90.
[55] Manoj Franklin,et al. The multiscalar architecture , 1993 .
[56] David E. Culler,et al. Compiler-Controlled Multithreading for Lenient Parallel Languages , 1991, FPCA.
[57] Robert H. Halstead,et al. MULTILISP: a language for concurrent symbolic computation , 1985, TOPL.
[58] Ken Kennedy,et al. Interprocedural transformations for parallel code generation , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[59] Brian N. Bershad,et al. Fast, effective dynamic compilation , 1996, PLDI '96.
[60] R. P. Colwell,et al. A 0.6 μm BiCMOS processor with dynamic execution , 1995, Proceedings ISSCC '95 - International Solid-State Circuits Conference.
[61] Eduard Ayguadé,et al. Increasing effective IPC by exploiting distant parallelism , 1999, ICS '99.
[62] Matthew Arnold,et al. Adaptive optimization in the Jalapeño JVM , 2000, OOPSLA '00.
[63] Zhiyuan Li,et al. Efficient interprocedural analysis for program parallelization and restructuring , 1988, PPEALS '88.
[64] Yale N. Patt,et al. Difficult-path branch prediction using subordinate microthreads , 2002, ISCA.
[65] Cameron McNairy,et al. Itanium 2 Processor Microarchitecture , 2003, IEEE Micro.
[66] Vivek Sarkar,et al. Automatic discovery of parallelism: a tool and an experiment (extended abstract) , 1988, PPoPP 1988.
[67] Pen-Chung Yew,et al. Efficient interprocedural analysis for program parallelization and restructuring , 1988, PPoPP 1988.
[68] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[69] Eric Rotenberg,et al. AR-SMT: a microarchitectural approach to fault tolerance in microprocessors , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[70] Todd C. Mowry,et al. The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[71] Nancy M. Amato,et al. Run-time methods for parallelizing partially parallel loops , 1995, ICS '95.
[72] Kevin O'Brien,et al. Single-program speculative multithreading (SPSM) architecture: compiler-assisted fine-grained multithreading , 1995, PACT.
[73] Wei Liu,et al. POSH: a TLS compiler that exploits program structure , 2006, PPoPP '06.
[74] Todd M. Austin,et al. Dynamic dependency analysis of ordinary programs , 1992, ISCA '92.
[75] Jack B. Dennis,et al. A preliminary architecture for a basic data-flow processor , 1974, ISCA '98.
[76] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[77] Olivier Temam,et al. CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[78] Mark Moir,et al. Transparent Support for Wait-Free Transactions , 1997, WDAG.
[79] Yale N. Patt,et al. HPS, a new microarchitecture: rationale and introduction , 1985, MICRO 18.
[80] Dennis Gannon,et al. Distributed pC++ Basic Ideas for an Object Parallel Language , 1993, Sci. Program..
[81] James M. Stichnoth,et al. Practicing JUDO: Java under dynamic optimizations , 2000, PLDI '00.
[82] Utpal Banerjee,et al. Speedup of ordinary programs , 1979 .
[83] Dean M. Tullsen,et al. Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices , 2005, PLDI '05.
[84] John Paul Shen,et al. Dynamic speculative precomputation , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[85] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[86] Harsh Sharangpani,et al. Itanium Processor Microarchitecture , 2000, IEEE Micro.
[87] Kenji Nishida,et al. Evaluation of a prototype data flow processor of the SIGMA-1 for scientific computations , 1986, ISCA 1986.
[88] Mark Moir,et al. Hybrid transactional memory , 2006, ASPLOS XII.
[89] David J. Lilja. Exploiting the parallelism available in loops , 1994, Computer.
[90] Frank Yellin,et al. The Java Virtual Machine Specification , 1996 .
[91] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS 1987.
[92] David E. Culler,et al. Monsoon: an explicit token-store architecture , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[93] Wei Liu,et al. Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation , 2005, ICS '05.
[94] Mark Weiser,et al. Program Slicing , 1981, IEEE Transactions on Software Engineering.
[95] Ken Kennedy,et al. A technique for summarizing data access and its use in parallelism enhancing transformations , 1989, PLDI '89.
[96] Scott A. Mahlke,et al. IMPACT: an architectural framework for multiple-instruction-issue processors , 1991, ISCA '91.
[97] Rudolf Eigenmann,et al. Automatic program parallelization , 1993, Proc. IEEE.
[98] Constantine Demetrios Polychronopoulos. On program restructuring, scheduling, and communication for parallel processor systems , 1986 .
[99] Burton H. Bloom,et al. Space/time trade-offs in hash coding with allowable errors , 1970, CACM.
[100] Hiroshi Yasuhara,et al. DDDP-a Distributed Data Driven Processor , 1983, ISCA '83.
[101] Jaehyuk Huh,et al. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture , 2003, ISCA '03.
[102] C. Zilles,et al. Time-Shifted Modules : Exploiting Code Modularity for Fine Grain Parallelization , 2000 .
[103] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.
[104] Harish Patil,et al. Efficient Run-time Monitoring Using Shadow Processing , 1995, AADEBUG.
[105] Paul Feautrier,et al. Direct parallelization of call statements , 1986, SIGPLAN '86.
[106] Luca Cardelli,et al. Modern concurrency abstractions for C# , 2002, TOPL.
[107] Donald E. Knuth,et al. The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition , 1997 .
[108] Kunle Olukotun,et al. Characterization of TCC on chip-multiprocessors , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[109] Ron Cytron,et al. Interprocedural dependence analysis and parallelization , 1986, SIGP.
[110] Suresh Jagannathan,et al. Transactional Monitors for Concurrent Objects , 2004, ECOOP.
[111] Antonia Zhai,et al. A scalable approach to thread-level speculation , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[112] Per Stenström,et al. Limits on speculative module-level parallelism in imperative and object-oriented programs on CMP platforms , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[113] Anastasia Ailamaki,et al. Tolerating Dependences Between Large Speculative Threads Via Sub-Threads , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[114] Thomas F. Knight. An architecture for mostly functional languages , 1986, LFP '86.
[115] Craig Zilles,et al. Execution-based prediction using speculative slices , 2001, ISCA 2001.
[116] Rishiyur S. Nikhil,et al. The Parallel Programming Language Id and its Compilation for Parallel Machines , 1993, Int. J. High Speed Comput..
[117] Josep Torrellas,et al. Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[118] Kunle Olukotun,et al. Exposing speculative thread parallelism in SPEC2000 , 2005, PPoPP.
[119] Per Stenström,et al. Reducing misspeculation overhead for module-level speculative execution , 2005, CF '05.
[120] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[121] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[122] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[123] Kunle Olukotun,et al. Data speculation support for a chip multiprocessor , 1998, ASPLOS VIII.
[124] C. Zilles,et al. Understanding the backward slices of performance degrading instructions , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[125] Yale N. Patt,et al. Simultaneous subordinate microthreading (SSMT) , 1999, ISCA.
[126] Andrew S. Grimshaw,et al. Easy-to-use object-oriented parallel processing with Mentat , 1993, Computer.
[127] Quinn Jacobson,et al. Architectural Support for Software Transactional Memory , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[128] Bernd Mohr,et al. Performance analysis of pC++: a portable data-parallel programming system for scalable parallel computers , 1994, Proceedings of 8th International Parallel Processing Symposium.
[129] Satoshi Matsushita,et al. Pinot: speculative multi-threading processor architecture exploiting parallelism over a wide range of granularities , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[130] James R. Larus,et al. Software and the Concurrency Revolution , 2005, ACM Queue.
[131] Matthew Arnold,et al. Online feedback-directed optimization of Java , 2002, OOPSLA '02.
[132] Josep Torrellas,et al. An efficient algorithm for the run-time parallelization of DOACROSS loops , 1994, Proceedings of Supercomputing '94.
[133] Andreas Moshovos,et al. Dependence based prefetching for linked data structures , 1998, ASPLOS VIII.
[134] Balaram Sinharoy,et al. IBM Power5 chip: a dual-core multithreaded processor , 2004, IEEE Micro.
[135] David A. Wood,et al. LogTM: log-based transactional memory , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..
[136] Monica S. Lam,et al. Array-data flow analysis and its use in array privatization , 1993, POPL '93.
[137] Nir Shavit,et al. Software transactional memory , 1995, PODC '95.
[138] Luis Ceze,et al. Implicit parallelism with ordered transactions , 2007, PPoPP.
[139] Mark Weiser,et al. Programmers use slices when debugging , 1982, CACM.
[140] Mayank Agarwal,et al. Exploiting Postdominance for Speculative Parallelization , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[142] Antonio González,et al. Thread-spawning schemes for speculative multithreading , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[143] Carl Hewitt,et al. Viewing Control Structures as Patterns of Passing Messages , 1977, Artif. Intell..
[144] Wei Liu,et al. AccMon: Automatically Detecting Memory-Related Bugs via Program Counter-Based Invariants , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[145] Lionel M. Ni,et al. Dependence Uniformization: A Loop Parallelization Technique , 1993, IEEE Trans. Parallel Distributed Syst..
[146] Zhiyuan Li,et al. Array privatization for parallel execution of loops , 1992 .
[147] Guilherme Ottoni,et al. Automatic thread extraction with decoupled software pipelining , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[148] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[149] Josep Torrellas,et al. Bulk Disambiguation of Speculative Threads in Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[150] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[151] Christoforos E. Kozyrakis,et al. Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[152] Rudolf Eigenmann,et al. Min-cut program decomposition for thread-level speculation , 2004, PLDI '04.
[153] Haitham Akkary,et al. A dynamic multithreading processor , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[154] Eric Rotenberg,et al. Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.
[155] Andreas Moshovos,et al. Improving virtual function call target prediction via dependence-based pre-computation , 1999, ICS '99.
[156] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[157] Donald Ervin Knuth,et al. The Art of Computer Programming , 1968 .
[158] Rudolf Eigenmann,et al. Speculative thread decomposition through empirical optimization , 2007, PPoPP.
[159] Olivier Temam,et al. Dataflow analysis of branch mispredictions and its application to early resolution of branch outcomes , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[160] T. Yuba,et al. An architecture of a dataflow single chip processor , 1989, ISCA '89.
[161] R. M. Tomasulo,et al. An efficient algorithm for exploiting multiple arithmetic units , 1995 .
[162] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[163] Balaram Sinharoy,et al. Design and implementation of the POWER5 microprocessor , 2004, Proceedings. 41st Design Automation Conference, 2004..
[164] Donald Yeung,et al. Design and evaluation of compiler algorithms for pre-execution , 2002, ASPLOS X.
[165] Henry Hoffmann,et al. Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[166] Andrew A. Chien,et al. Concurrent aggregates (CA) , 1990, PPOPP '90.
[167] Antonia Zhai,et al. Improving value communication for thread-level speculation , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[168] Kenji Nishida,et al. Evaluation of a Prototype Data Flow Processor of the SIGMA-1 for Scientific Computations , 1986, ISCA.
[169] Gavin M. Bierman,et al. The Essence of Data Access in Comega , 2005, European Conference on Object-Oriented Programming.
[170] Joseph A. Fisher,et al. Very Long Instruction Word architectures and the ELI-512 , 1983, ISCA '83.
[171] Ian Watson,et al. The Manchester prototype dataflow computer , 1985, CACM.
[172] Pierre America,et al. Issues in the design of a parallel object-oriented language , 1989, Formal Aspects of Computing.
[173] James E. Smith,et al. An instruction set and microarchitecture for instruction level distributed processing , 2002, ISCA.
[174] Nicholas Carriero,et al. Linda in context , 1989, CACM.
[175] John Paul Shen,et al. Mitosis: A Speculative Multithreaded Processor Based on Precomputation Slices , 2008, IEEE Transactions on Parallel and Distributed Systems.
[176] Ken Kennedy,et al. Loop distribution with arbitrary control flow , 1990, Proceedings SUPERCOMPUTING '90.
[177] Marc Tremblay,et al. The MAJC Architecture: A Synthesis of Parallelism and Scalability , 2000, IEEE Micro.
[178] Bob Iannucci. Toward a dataflow/von Neumann hybrid architecture , 1988, [1988] The 15th Annual International Symposium on Computer Architecture. Conference Proceedings.
[179] Christopher Hughes,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, ISCA 2001.
[180] Fredrik Larsson,et al. Simics: A Full System Simulation Platform , 2002, Computer.
[181] J. Ramanujam,et al. A methodology for parallelizing programs for multicomputers and complex memory multiprocessors , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[182] Thomas Rauber,et al. The shared-memory language pSather on a distributed-memory multiprocessor , 1993, SIGP.
[183] Josep Torrellas,et al. ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes , 2003, ISCA '03.
[184] Scott A. Mahlke,et al. Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-thread Applications , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[185] Josep Torrellas,et al. Architectural support for scalable speculative parallelization in shared-memory multiprocessors , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[186] Todd C. Mowry,et al. Hardware support for thread-level speculation , 2003 .
[187] Jack B. Dennis,et al. First version of a data flow procedure language , 1974, Symposium on Programming.
[188] Fred Weber,et al. AMD 3DNow! technology: architecture and implementations , 1999, IEEE Micro.
[189] Koen De Bosschere,et al. LANCET: a nifty code editing tool , 2005, PASTE '05.
[190] Kunle Olukotun,et al. Exploiting method-level parallelism in single-threaded Java programs , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[191] Monica S. Lam,et al. Limits of control flow on parallelism , 1992, ISCA '92.
[192] Kathryn S. McKinley. Evaluating automatic parallelization for efficient execution on shared-memory multiprocessors , 1994, ICS '94.
[193] Zhiyuan Li. Array privatization for parallel execution of loops , 1992, ICS.
[194] Wilson C. Hsieh,et al. Automatic generation of DAG parallelism , 1989, PLDI '89.
[195] Alexandru Nicolau,et al. Parallelizing Programs with Recursive Data Structures , 1989, IEEE Trans. Parallel Distributed Syst..
[196] Antonio González,et al. A quantitative assessment of thread-level speculation techniques , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[197] Wei Liu,et al. iWatcher: efficient architectural support for software debugging , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[198] A. L. Davis,et al. The architecture and system method of DDM1: A recursively structured Data Driven Machine , 1978, ISCA '78.
[199] Michael J. Flynn,et al. Some Computer Organizations and Their Effectiveness , 1972, IEEE Transactions on Computers.
[200] Ravi Rajwar,et al. Speculative lock elision: enabling highly concurrent multithreaded execution , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[201] Dionisios N. Pnevmatikatos,et al. Slice-processors: an implementation of operation-based prediction , 2001, ICS '01.
[202] Gurindar S. Sohi,et al. Speculative data-driven multithreading , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[203] Ken Kennedy,et al. Automatic decomposition of scientific programs for parallel execution , 1987, POPL '87.
[204] Arvind,et al. Executing a Program on the MIT Tagged-Token Dataflow Architecture , 1990, IEEE Trans. Computers.
[205] Wolfram Schulte,et al. The essence of data access in Cω: the power is in the dot! , 2005 .
[206] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[207] B. Ramakrishna Rau,et al. Instruction-level parallel processing: History, overview, and perspective , 2005, The Journal of Supercomputing.
[208] Gurindar S. Sohi,et al. Master/slave speculative parallelization and approximate code , 2002 .
[209] Gurindar S. Sohi,et al. Speculative Multithreaded Processors , 2001, Computer.
[210] Antonio González,et al. Speculative multithreaded processors , 1998, ICS '98.
[212] Kunle Olukotun,et al. Using thread-level speculation to simplify manual parallelization , 2003, PPoPP '03.
[213] Maurice Herlihy,et al. Virtualizing Transactional Memory , 2005, ISCA 2005.
[214] Milind Girkar,et al. Automatic Extraction of Functional Parallelism from Ordinary Programs , 1992, IEEE Trans. Parallel Distributed Syst..
[215] Sanjay Ghemawat,et al. MapReduce: simplified data processing on large clusters , 2008, CACM.
[216] Jian Huang,et al. The Superthreaded Processor Architecture , 1999, IEEE Trans. Computers.
[217] Kunle Olukotun,et al. Transactional memory coherence and consistency , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[218] Akinori Yonezawa. ABCL : an object-oriented concurrent system , 1990 .
[219] Bradley C. Kuszmaul,et al. Unbounded transactional memory , 2005, 11th International Symposium on High-Performance Computer Architecture.
[220] Gurindar S. Sohi,et al. The Expandable Split Window Paradigm for Exploiting Fine-grain Parallelism , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[221] Gurindar S. Sohi,et al. Master/Slave Speculative Parallelization , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[222] Derek Bruening,et al. An infrastructure for adaptive dynamic optimization , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..
[223] Babak Falsafi,et al. Implicitly-multithreaded processors , 2003, ISCA '03.