Reducing exception management overhead with software restart markers
[1] Mikko H. Lipasti,et al. Deconstructing commit , 2004, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2004.
[2] Mateo Valero,et al. Adding a vector unit to a superscalar processor , 1999, ICS '99.
[3] Henry M. Levy,et al. Hardware and software support for efficient exception handling , 1994, ASPLOS VI.
[4] Edward McLellan. The Alpha AXP architecture and 21064 processor , 1993, IEEE Micro.
[5] Anoop Gupta,et al. The impact of architectural trends on operating system performance , 1995, SOSP.
[6] Andrew R. Pleszkun,et al. Implementing Precise Interrupts in Pipelined Processors , 1988, IEEE Trans. Computers.
[7] Xia Chen,et al. A spatial path scheduling algorithm for EDGE architectures , 2006, ASPLOS XII.
[8] James E. Smith. Retrospective: implementing precise interrupts in pipelined processors , 1998, ISCA '98.
[9] Peter Y.-T. Hsu,et al. Overlapped loop support in the Cydra 5 , 1989, ASPLOS III.
[10] Jaewook Shin,et al. Superword-level parallelism in the presence of control flow , 2005, International Symposium on Code Generation and Optimization.
[11] Peter J. Denning. Virtual Memory , 1996, ACM Comput. Surv..
[12] David I. August,et al. Sentinel Scheduling with Recovery Blocks , 1995 .
[13] Michael Gschwind. The Cell Broadband Engine: Exploiting Multiple Levels of Parallelism in a Chip Multiprocessor , 2007, International Journal of Parallel Programming.
[14] Anand Sivasubramaniam,et al. Characterizing the d-TLB behavior of SPEC CPU2000 benchmarks , 2002, SIGMETRICS '02.
[15] Paolo Faraboschi,et al. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools , 2004 .
[16] Saman P. Amarasinghe,et al. Exploiting superword level parallelism with multimedia instruction sets , 2000, PLDI '00.
[17] Uri C. Weiser,et al. MMX technology extension to the Intel architecture , 1996, IEEE Micro.
[18] Per Stenström,et al. Limits on Thread-Level Speculative Parallelism in Embedded Applications , 2007 .
[19] D. Marr,et al. Hyper-Threading Technology Architecture and Microarchitecture , 2002 .
[20] Francisco J. Cazorla,et al. Kilo-instruction processors: overcoming the memory wall , 2005, IEEE Micro.
[21] James K. Pickett,et al. Enhanced superscalar hardware: The schedule table , 1993, Supercomputing '93. Proceedings.
[22] Balaram Sinharoy,et al. POWER5 system microarchitecture , 2005, IBM J. Res. Dev..
[23] Richard R. Oehler,et al. IBM RISC System/6000 Processor Architecture , 1990, IBM J. Res. Dev..
[24] Robert P. Colwell,et al. Architecture and implementation of a VLIW supercomputer , 1990, Proceedings SUPERCOMPUTING '90.
[25] Mark Jerome Hampton,et al. Exposing datapath elements to reduce microprocessor energy consumption , 2001 .
[26] Andrew W. Appel,et al. Virtual memory primitives for user programs , 1991, ASPLOS IV.
[27] Andrew R. Pleszkun,et al. WISQ: a restartable architecture using queues , 1987, ISCA '87.
[28] Sang Lyul Min,et al. Compiler-assisted demand paging for embedded systems with flash memory , 2004, EMSOFT '04.
[29] Gürhan Küçük,et al. Complexity-effective reorder buffer designs for superscalar processors , 2004, IEEE Transactions on Computers.
[30] Pat Conway,et al. The AMD Opteron Processor for Multiprocessor Servers , 2003, IEEE Micro.
[31] Vittorio Zaccaria,et al. Low-power data forwarding for VLIW embedded architectures , 2002, IEEE Trans. Very Large Scale Integr. Syst..
[32] Trevor N. Mudge,et al. Virtual memory in contemporary microprocessors , 1998, IEEE Micro.
[33] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001 .
[34] David J. Sager,et al. The microarchitecture of the Pentium 4 processor , 2001 .
[35] B. R. Rau,et al. The Cydra 5 Departmental Supercomputer: design philosophies, decisions and trade-offs , 1989, [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track.
[36] Vladimir M. Pentkovski,et al. Implementing Streaming SIMD Extensions on the Pentium III Processor , 2000, IEEE Micro.
[37] Andrew R. Pleszkun,et al. Implementation of precise interrupts in pipelined processors , 1985, ISCA '85.
[38] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[39] Bradley G. Burgess,et al. The PowerPC 603 microprocessor: a high performance, low power, superscalar RISC microprocessor , 1994, Proceedings of COMPCON '94.
[40] David B. Loveman,et al. Program Improvement by Source-to-Source Transformation , 1977, J. ACM.
[41] Susan J. Eggers,et al. Mini-threads: increasing TLP on small-scale SMT processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[42] Krste Asanovic,et al. Energy-exposed instruction sets , 2002 .
[43] Cameron McNairy,et al. Itanium 2 Processor Microarchitecture , 2003, IEEE Micro.
[44] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev..
[45] Christopher Batten,et al. Cache Refill/Access Decoupling for Vector Machines , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[46] William J. Dally,et al. Imagine: Media Processing with Streams , 2001, IEEE Micro.
[47] Chris Bailey,et al. A mechanism for implementing precise exceptions in pipelined processors , 2004, Euromicro Symposium on Digital System Design, 2004. DSD 2004..
[48] Richard E. Hank,et al. Region-based compilation: an introduction and motivation , 1995, MICRO 1995.
[49] Xiangrong Zhou,et al. Rapid and low-cost context-switch through embedded processor customization for real-time and control applications , 2006, 2006 43rd ACM/IEEE Design Automation Conference.
[50] Ho-Seop Kim,et al. An instruction set and microarchitecture for instruction level distributed processing , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[51] Andy D. Pimentel,et al. TriMedia CPU64 architecture , 1999, Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040).
[52] Ernst L. Leiss,et al. Modulo scheduling for the TMS320C6x VLIW DSP architecture , 1999, LCTES '99.
[53] Gurindar S. Sohi,et al. The use of multithreading for exception handling , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[54] Nasr Ullah,et al. The MC88110 implementation of precise exceptions in a superscalar architecture , 1993, CARN.
[55] Chris R. Jesshope,et al. A Microthreaded Architecture and its Compiler , 2006 .
[56] Scott A. Mahlke,et al. Trimaran: An Infrastructure for Research in Instruction-Level Parallelism , 2004, LCPC.
[57] Harry Dwyer,et al. An out-of-order superscalar processor with speculative execution and fast, precise interrupts , 1992, MICRO 25.
[58] Anantha P. Chandrakasan,et al. Low-power CMOS digital design , 1992 .
[59] Chong-Min Kyung,et al. New hardware scheme supporting precise exception handling for out-of-order execution , 1994 .
[60] Chia-Jiu Wang,et al. Implementing precise interruptions in pipelined RISC processors , 1993, IEEE Micro.
[61] M. Tremblay,et al. UltraSparc I: a four-issue processor supporting multimedia , 1996, IEEE Micro.
[62] Ryan N. Rakvic,et al. A comprehensive study of hardware/software approaches to improve TLB performance for java applications on embedded systems , 2006, MSPC '06.
[63] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[64] David W. Anderson,et al. The IBM System/360 model 91: machine philosophy and instruction-handling , 1967 .
[65] Mateo Valero,et al. Decoupled vector architectures , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[66] Sumedh W. Sathaye,et al. A fast interrupt handling scheme for VLIW processors , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[67] Alain J. Martin,et al. Precise exceptions in asynchronous processors , 2001, Proceedings 2001 Conference on Advanced Research in VLSI. ARVLSI 2001.
[68] Bruce D. Lightner,et al. The Metaflow Lightning chipset , 1991, COMPCON Spring '91 Digest of Papers.
[69] Per Stenström,et al. Recency-based TLB preloading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[70] Ilhoon Shin,et al. SWL: a search-while-load demand paging scheme with NAND flash memory , 2007, LCTES '07.
[71] P.R. Wilson,et al. Pointer swizzling at page fault time: efficiently and compatibly supporting huge address spaces on standard hardware , 1992, [1992] Proceedings of the Second International Workshop on Object Orientation in Operating Systems.
[72] Mateo Valero,et al. Toward kilo-instruction processors , 2004, TACO.
[73] Brad Burgess,et al. A G3 PowerPC™ superscalar low-power microprocessor , 1997, Proceedings IEEE COMPCON 97. Digest of Papers.
[74] G. Blanck,et al. The SuperSPARC microprocessor , 1992, Digest of Papers COMPCON Spring 1992.
[75] Nader Vasseghi,et al. The Mips R4000 processor , 1992, IEEE Micro.
[76] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[77] Kevin O'Brien,et al. Single-program speculative multithreading (SPSM) architecture: compiler-assisted fine-grained multithreading , 1995, PACT.
[78] Chris R. Jesshope. Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines , 2001, Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
[79] William E. Weihl,et al. Register relocation: flexible contexts for multithreading , 1993, ISCA '93.
[80] Steven W. White,et al. POWER3: The next generation of PowerPC processors , 2000, IBM J. Res. Dev..
[81] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[82] Erik Brunvand,et al. Precise exception handling for a self-timed processor , 1995, Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors.
[83] Dave Christie. Developing the AMD-K5 architecture , 1996, IEEE Micro.
[84] Mark Horowitz,et al. Energy dissipation in general purpose microprocessors , 1996, IEEE J. Solid State Circuits.
[85] John Wawrzynek,et al. Vector microprocessors , 1998 .
[86] Vicki H. Allan,et al. Software pipelining , 1995, CSUR.
[87] William J. Dally,et al. The Named-State Register File: implementation and performance , 1995, Proceedings of 1995 1st IEEE Symposium on High Performance Computer Architecture.
[88] Doug Hunt,et al. Advanced performance features of the 64-bit PA-8000 , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[89] Jaewook Shin. Introducing Control Flow into Vectorized Code , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[90] Ken Kennedy,et al. Conversion of control dependence to data dependence , 1983, POPL '83.
[91] Hwa C. Torng,et al. Interrupt Handling for Out-of-Order Execution Processors , 1993, IEEE Trans. Computers.
[92] Allan Porterfield,et al. The Tera computer system , 1990 .
[93] André Seznec,et al. Out-of-order execution may not be cost-effective on processors featuring simultaneous multithreading , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[94] P. Faraboschi,et al. Lx: a technology platform for customizable VLIW embedded processing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[95] Shinji Tomita. Classic Books and Papers of the 20th Century: R. M. Tomasulo: An Efficient Algorithm for Exploiting Multiple Arithmetic Units , 2004 .
[96] H. Peter Hofstee,et al. Power efficient processor architecture and the cell processor , 2005, 11th International Symposium on High-Performance Computer Architecture.
[97] William J. Dally,et al. Concurrent Event Handling through Multithreading , 1999, IEEE Trans. Computers.
[98] Richard L. Sites,et al. Alpha AXP architecture , 1993, CACM.
[99] Trevor N. Mudge,et al. Design Tradeoffs For Software-managed Tlbs , 1994, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[100] Chris R. Jesshope. Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines , 2001 .
[101] Ronny Krashinsky. Vector-thread architecture and implementation , 2007 .
[102] M. Frans Kaashoek,et al. Software prefetching and caching for translation lookaside buffers , 1994, OSDI '94.
[103] Jaewook Shin,et al. Evaluating compiler technology for control-flow optimizations for multimedia extension architectures , 2009, Microprocess. Microsystems.
[104] Rodric M. Rabbah,et al. Exploiting vector parallelism in software pipelined loops , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[105] Rajiv Gupta,et al. Comparison checking: an approach to avoid debugging of optimized code , 1999, ESEC/FSE-7.
[106] Harsh Sharangpani,et al. Itanium Processor Microarchitecture , 2000, IEEE Micro.
[107] Jaehyuk Huh,et al. Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture , 2003, IEEE Micro.
[108] Wen-mei W. Hwu,et al. Modulo schedule buffers , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[109] Thomas L. Anderson,et al. The cydra 5 minisupercomputer: Architecture and implementation , 1993, The Journal of Supercomputing.
[110] Kathryn S. McKinley,et al. Static placement, dynamic issue (SPDI) scheduling for EDGE architectures , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[111] Christoforos E. Kozyrakis,et al. Overcoming the limitations of conventional vector processors , 2003, ISCA '03.
[112] J.F. Martinez,et al. Cherry: Checkpointed early resource recycling in out-of-order microprocessors , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[113] Christopher Batten,et al. The vector-thread architecture , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[114] Mansur H. Samadzadeh,et al. Hardware/Software Cost Analysis of Interrupt Processing Strategies , 2001, IEEE Micro.
[115] Marc Tremblay,et al. High-performance throughput computing , 2005, IEEE Micro.
[116] Babak Falsafi,et al. Reference idempotency analysis: a framework for optimizing speculative execution , 2001, PPoPP '01.
[117] Jang-Suk Park,et al. A software-controlled prefetching mechanism for software-managed TLBs , 1995, Microprocess. Microprogramming.
[118] Thomas Thomas,et al. The PowerPC 620 microprocessor: a high performance superscalar RISC microprocessor , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[119] Milind Girkar,et al. Challenges in exploitation of loop parallelism in embedded applications , 2006, Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS '06).
[120] S. Peter Song,et al. The PowerPC 604 RISC microprocessor , 1994, IEEE Micro.
[121] William J. Dally,et al. Compiling for stream processing , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[122] Haitham Akkary,et al. Checkpoint processing and recovery: towards scalable large instruction window processors , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[123] Stamatis Vassiliadis,et al. Register renaming and dynamic speculation: an alternative approach , 1993, MICRO.
[124] Steven W. K. Tjiang,et al. SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.
[125] David B. Whalley,et al. Fast context switches: compiler and architectural support for preemptive scheduling , 1995, Microprocess. Microsystems.
[126] Jerry Huck,et al. Architectural support for translation table management in large address space machines , 1993, ISCA '93.
[127] Andrew Wolfe,et al. A variable instruction stream extension to the VLIW architecture , 1991, ASPLOS IV.
[128] Hunter Scales,et al. AltiVec Extension to PowerPC Accelerates Media Processing , 2000, IEEE Micro.
[129] Dana S. Henry. Adding Fast Interrupts to Superscalar Processors , 2005 .
[130] R. M. Tomasulo,et al. An efficient algorithm for exploiting multiple arithmetic units , 1995 .
[131] Werner Buchholz. The IBM System/370 Vector Architecture , 1986, IBM Syst. J..
[132] Keith D. Underwood,et al. Characterizing a new class of threads in scientific applications for high end supercomputers , 2004, ICS '04.
[133] Gabriel H. Loh,et al. Static strands: safely collapsing dependence chains for increasing embedded power efficiency , 2005, LCTES.
[134] Yale N. Patt,et al. Performance benefits of large execution atomic units in dynamically scheduled machines , 1989, ICS '89.
[135] B. Ramakrishna Rau,et al. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing , 1981, MICRO 14.
[136] Harvey G. Cragon,et al. Interrupt Processing in Concurrent Processors , 1995, Computer.
[137] Aaron Smith,et al. Compiling for EDGE architectures , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[138] Peter Yan-Tek Hsu. Designing the TFP microprocessor , 1994, IEEE Micro.
[139] Theo Ungerer,et al. A survey of processors with explicit multithreading , 2003, CSUR.
[140] Brian N. Bershad,et al. The interaction of architecture and operating system design , 1991, ASPLOS IV.
[141] Matthew K. Farrens,et al. Code Partitioning in Decoupled Compilers , 2000, Euro-Par.
[142] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[143] Gary Goldman,et al. UltraSPARC-II: the advancement of ultracomputing , 1996, COMPCON '96. Technologies for the Information Superhighway Digest of Papers.
[144] Gurindar S. Sohi. Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers , 1990 .
[145] Antonio González,et al. Energy-effective issue logic , 2001, ISCA 2001.
[146] Colin Whitby-Strevens. The transputer , 1985, ISCA 1985.
[147] Yale N. Patt,et al. Checkpoint repair for out-of-order execution machines , 1987, ISCA '87.
[148] Michael Gschwind,et al. Optimizing Compiler for the CELL Processor , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[149] N. Seshan. High VelociTI processing [Texas Instruments VLIW DSP architecture] , 1998 .
[150] Peter F. Sweeney,et al. Multiple page size modeling and optimization , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[151] Richard M. Russell,et al. The CRAY-1 computer system , 1978, CACM.
[152] Richard E. Kessler,et al. The Alpha 21264 microprocessor , 1999, IEEE Micro.
[153] Keith Diefendorff. K7 Challenges Intel: 10/26/98 , 1998 .
[154] Aamer Jaleel,et al. In-line interrupt handling and lock-up free translation lookaside buffers (TLBs) , 2006, IEEE Transactions on Computers.
[155] Yasuhiko Hagihara,et al. A hardware overview of SX-6 and SX-7 supercomputer , 2003 .
[156] DeForest Tovey,et al. Microarchitecture of HaL's CPU , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[157] Trevor N. Mudge,et al. A look at several memory management units, TLB-refill mechanisms, and page table organizations , 1998, ASPLOS VIII.
[158] Josep Llosa,et al. Out-of-order commit processors , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[159] Tzi-cker Chiueh,et al. Multi-threaded vectorization , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[160] Kenneth C. Yeager. The Mips R10000 superscalar microprocessor , 1996, IEEE Micro.
[161] Stamatis Vassiliadis,et al. Precise Interrupts , 1996, IEEE Micro.
[162] Gary Lauterbach,et al. UltraSPARC-III: designing third-generation 64-bit performance , 1999, IEEE Micro.
[163] Gary Gibson,et al. The Metaflow architecture , 1991, IEEE Micro.
[164] John Paul Shen,et al. Balancing Fine- and Medium-Grained Parallelism in Scheduling Loops for the XIMD Architecture , 1993, Architectures and Compilation Techniques for Fine and Medium Grain Parallelism.
[165] David A. Patterson,et al. Scalable Vector Media-processors for Embedded Systems , 2002 .
[166] John H. Edmondson,et al. Superscalar instruction execution in the 21164 Alpha microprocessor , 1995, IEEE Micro.
[167] Alan E. Charlesworth,et al. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family , 1981, Computer.
[168] Ricardo Bianchini,et al. The MIT Alewife machine: architecture and performance , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[169] Mateo Valero,et al. Out-of-order vector architectures , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[170] Haitham Akkary,et al. Checkpoint Processing and Recovery: An Efficient, Scalable Alternative to Reorder Buffers , 2003, IEEE Micro.
[171] A. Klaiber. The Technology Behind Crusoe™ Processors: Low-power x86-Compatible Processors Implemented with Code Morphing , 2000 .
[172] Corinna G. Lee,et al. Simple vector microprocessors for multimedia applications , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[173] Haitham Akkary,et al. Continual flow pipelines , 2004, ASPLOS XI.
[174] Burton J. Smith,et al. A processor architecture for Horizon , 1988, Proceedings. SUPERCOMPUTING '88.
[175] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[176] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[177] Trevor Mudge,et al. Improving data cache performance by pre-executing instructions under a cache miss , 1997 .
[178] Scott A. Mahlke,et al. Region-based hierarchical operation partitioning for multicluster processors , 2003, PLDI '03.
[179] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS.
[180] Norman P. Jouppi,et al. A simulation based study of TLB performance , 1992, ISCA '92.
[181] G. Kandiraju,et al. Going the distance for TLB prefetching: an application-driven study , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[182] Krste Asanovic,et al. Compiling for vector-thread architectures , 2008, CGO '08.
[183] Jian Huang,et al. The Superthreaded Processor Architecture , 1999, IEEE Trans. Computers.
[184] James C. Dehnert,et al. Overlapped loop support in the Cydra 5 , 1989, ASPLOS 1989.
[185] Robert E. Tarjan,et al. Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..
[186] Trevor N. Mudge,et al. Virtual Memory: Issues of Implementation , 1998, Computer.
[187] Kevin W. Rudd,et al. Efficient Exception Handling Techniques for High-Performance Processor Architectures , 1997 .
[188] Masayuki Ikeda,et al. Architecture of the VPP500 parallel supercomputer , 1994, Proceedings of Supercomputing '94.
[189] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[190] Krste Asanovic,et al. Implementing virtual memory in a vector processor with software restart markers , 2006, ICS '06.
[191] Todd M. Austin,et al. High-Bandwidth Address Translation for Multiple-Issue Processors , 1996, ISCA.
[192] Maurice Herlihy,et al. Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[193] Steven R. Kunkel,et al. A multithreaded PowerPC processor for commercial servers , 2000, IBM J. Res. Dev..
[194] Burton J. Smith. Architecture And Applications Of The HEP Multiprocessor Computer System , 1982, Optics & Photonics.
[195] Matthew Mattina,et al. Tarantula: a vector extension to the alpha architecture , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[196] S. Alii,et al. A mechanism for implementing precise exceptions in pipelined processors , 2004 .