Memory consistency models for shared memory multiprocessors
[1] Anoop Gupta,et al. Sufficient System Requirements for Supporting the PLpc Memory Model , 1993 .
[2] Jong-Deok Choi,et al. An efficient cache-based access anomaly detection scheme , 1991, ASPLOS IV.
[3] David B. Gustavson,et al. Scalable Coherent Interface , 1990, COMPEURO'90: Proceedings of the 1990 IEEE International Conference on Computer Systems and Software Engineering - Systems Engineering Aspects of Complex Computerized Systems.
[4] Yehuda Afek,et al. A lazy cache algorithm , 1989, SPAA '89.
[5] Barton P. Miller,et al. Improving the accuracy of data race detection , 1991, PPOPP '91.
[6] T. Mowry,et al. Comparative evaluation of latency reducing and tolerating techniques , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[7] Kourosh Gharachorloo,et al. Detecting violations of sequential consistency , 1991, SPAA '91.
[8] David L. Dill,et al. An executable specification, analyzer and verifier for RMO (relaxed memory order) , 1995, SPAA '95.
[9] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[10] A. Gupta,et al. Parallel distributed-time logic simulation , 1989, IEEE Design & Test of Computers.
[11] William M. Johnson,et al. Super-scalar processor design , 1989 .
[12] Michael D. Smith,et al. Boosting beyond static scheduling in a superscalar processor , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[13] Anant Agarwal,et al. Closing the window of vulnerability in multiphase memory transactions , 1992, ASPLOS V.
[14] Daniel E. Lenoski,et al. The design and analysis of DASH: a scalable directory-based multiprocessor , 1992 .
[15] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[16] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[17] Michel Dubois,et al. Memory access buffering in multiprocessors , 1998, ISCA '98.
[18] Yale N. Patt,et al. Exploiting Fine-Grained Parallelism Through a Combination of Hardware and Software Techniques , 1991, ISCA.
[19] J.P. Singh. Implications of Hierarchical N-body Methods for Multiprocessor Architecture , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[20] Samuel P. Midkiff,et al. Compiling programs with user parallelism , 1990 .
[21] J. Mcdonald,et al. Vectorization of a particle simulation method for hypersonic rarefied flow , 1988 .
[22] Leslie Lamport,et al. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 1979, IEEE Transactions on Computers.
[23] Katherine A. Yelick,et al. Optimizing parallel programs with explicit synchronization , 1995, PLDI '95.
[24] Roy Friedman,et al. Shared memory consistency conditions for non-sequential execution: definitions and programming strategies , 1993, SPAA '93.
[25] James R. Goodman,et al. Cache Consistency and Sequential Consistency , 1991 .
[26] Francisco Corella,et al. Specification of the PowerPC shared memory architecture , 1993 .
[27] Cathy May,et al. The PowerPC Architecture: A Specification for a New Family of RISC Processors , 1994 .
[28] Michael Stumm,et al. Cache consistency in hierarchical-ring-based multiprocessors , 1992, Proceedings Supercomputing '92.
[29] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[30] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[31] Anoop Gupta,et al. Specifying system requirements for memory consistency models , 1993 .
[32] Josep Torrellas,et al. False Sharing and Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.
[33] Richard P. LaRowe,et al. Hiding Shared Memory Reference Latency on the Galactica Net Distributed Shared Memory Architecture , 1992, J. Parallel Distributed Comput..
[34] Sarita V. Adve,et al. Designing memory consistency models for shared-memory multiprocessors , 1993 .
[35] David Padua,et al. Debugging Fortran on a shared memory machine , 1987 .
[36] Roy Friedman,et al. A Correctness Condition for High-Performance Multiprocessors , 1998, SIAM J. Comput..
[37] David W. Wall,et al. Link-Time Code Modification , 1989 .
[38] Paul Hudak,et al. Memory coherence in shared virtual memory systems , 1986, PODC '86.
[39] Michel Dubois,et al. Access ordering and coherence in shared memory multiprocessors , 1989 .
[40] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[41] Alan L. Cox,et al. Lazy release consistency for software distributed shared memory , 1992, ISCA '92.
[42] Jonathan Rose. LocusRoute: a parallel global router for standard cells , 1988, 25th ACM/IEEE Design Automation Conference, Proceedings 1988.
[43] Mark D. Hill,et al. A Unified Formalization of Four Shared-Memory Models , 1993, IEEE Trans. Parallel Distributed Syst..
[44] Anant Agarwal,et al. APRIL: a processor architecture for multiprocessing , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[45] Michel Dubois,et al. Correct memory operation of cache-based multiprocessors , 1987, ISCA '87.
[46] Michael E. Wolf,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[47] Larry Rudolph,et al. Dynamic decentralized cache schemes for MIMD parallel processors , 1984, ISCA '84.
[48] Edith Schonberg,et al. An empirical comparison of monitoring algorithms for access anomaly detection , 1990, PPOPP '90.
[49] Anoop Gupta,et al. Two Techniques to Enhance the Performance of Memory Consistency Models , 1991, ICPP.
[50] Richard N. Taylor,et al. A general-purpose algorithm for analyzing concurrent programs , 1983, CACM.
[51] Anoop Gupta,et al. Hiding memory latency using dynamic scheduling in shared-memory multiprocessors , 1992, ISCA '92.
[52] Brian N. Bershad,et al. Midway: shared memory parallel programming with entry consistency for distributed memory multiprocessors , 1991 .
[53] Mark D. Hill,et al. Weak ordering—a new definition , 1998, ISCA '98.
[54] Mats Brorsson,et al. An adaptive cache coherence protocol optimized for migratory sharing , 1993, ISCA '93.
[55] Ken Kennedy,et al. Parallel program debugging with on-the-fly anomaly detection , 1990, Proceedings SUPERCOMPUTING '90.
[56] Anoop Gupta,et al. The directory-based cache coherence protocol for the DASH multiprocessor , 1990, ISCA '90.
[57] Christos H. Papadimitriou,et al. The Theory of Database Concurrency Control , 1986 .
[58] Bob Beck,et al. Shared-memory parallel programming in C++ , 1990, IEEE Software.
[59] Kourosh Gharachorloo,et al. Proving sequential consistency of high-performance shared memories (extended abstract) , 1991, SPAA '91.
[60] Alan L. Cox,et al. Evaluation of release consistent software distributed shared memory on emerging network technology , 1993, ISCA '93.
[61] Michel Dubois,et al. Delayed consistency and its effects on the miss rate of parallel programs , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[62] Andrew R. Pleszkun,et al. Implementation of precise interrupts in pipelined processors , 1985, ISCA '85.
[63] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[64] Niklaus Wirth,et al. Modula: A language for modular multiprogramming , 1977, Softw. Pract. Exp..
[65] Michel Dubois,et al. Memory Access Dependencies in Shared-Memory Multiprocessors , 1990, IEEE Trans. Software Eng..
[66] Per Brinch Hansen,et al. The Architecture of Concurrent Programs , 1977 .
[67] Robert H. B. Netzer,et al. Detecting data races on weak memory systems , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[68] David B. Loveman. High performance Fortran , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[69] Barton P. Miller,et al. On the Complexity of Event Ordering for Shared-Memory Parallel Program Executions , 1990, ICPP.
[70] R.K. Brayton,et al. Automatic verification of memory systems which service their requests out of order , 1995, Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair.
[71] William W. Collier,et al. Reasoning about parallel architectures , 1992 .
[72] Michel Dubois,et al. Lockup-free Caches in High-Performance Multiprocessors , 1990, J. Parallel Distributed Comput..
[73] Ken Kennedy,et al. Compile-time detection of race conditions in a parallel program , 1989, ICS '89.
[74] John L. Hennessy,et al. Finding and Exploiting Parallelism in an Ocean Simulation Program: Experience, Results, and Implications , 1992, J. Parallel Distributed Comput..
[75] James H. Patterson,et al. Portable Programs for Parallel Processors , 1987 .
[76] Erik Hagersten,et al. Race-Free Interconnection Networks and Multiprocessor Consistency , 1991, ISCA.
[77] Brian N. Bershad,et al. PRESTO: A system for object‐oriented parallel programming , 1988, Softw. Pract. Exp..
[78] Brian N. Bershad,et al. The Midway distributed shared memory system , 1993, Digest of Papers. Compcon Spring.
[79] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1984, TOPL.
[80] Anoop Gupta,et al. Programming for Different Memory Consistency Models , 1992, J. Parallel Distributed Comput..
[81] Michel Cekleov,et al. Formal Specification of Memory Models , 1992 .
[82] David W. Wall,et al. Global register allocation at link time , 1986, SIGPLAN '86.
[83] Katherine A. Yelick,et al. Optimizing Parallel SPMD Programs , 1994, LCPC.
[84] Kunle Olukotun,et al. Performance Optimization of Pipelined Primary Caches , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[85] Kevin P. McAuliffe,et al. RP3 Processor-Memory Element , 1985, ICPP.
[86] Susan J. Eggers,et al. On the validity of trace-driven simulation for multiprocessors , 1991, ISCA '91.
[87] Richard Noah Zucker,et al. Relaxed consistency and synchronization in parallel processors , 1992 .
[88] Anoop Gupta,et al. Performance evaluation of memory consistency models for shared-memory multiprocessors , 1991, ASPLOS IV.
[89] Richard Thomas Simoni, Jr.,et al. Cache coherence directories for scalable multiprocessors , 1992 .
[90] Monica S. Lam,et al. Jade: a high-level, machine-independent language for parallel programming , 1993, Computer.
[91] Anoop Gupta,et al. Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[92] Robert J. Fowler,et al. Adaptive cache coherency for detecting migratory shared data , 1993, ISCA '93.
[93] Michel Dubois,et al. Concurrent Miss Resolution in Multiprocessor Caches , 1988, ICPP.
[94] A. Gupta,et al. The Stanford FLASH multiprocessor , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[95] Helen Davis,et al. Tango introduction and tutorial , 1990 .
[96] Barton P. Miller,et al. Detecting Data Races in Parallel Program Executions , 1989 .
[97] Maurice Herlihy,et al. Linearizability: a correctness condition for concurrent objects , 1990, TOPL.
[98] N. Jouppi,et al. Complexity/performance tradeoffs with non-blocking loads , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[99] Dennis Shasha,et al. Efficient and correct execution of parallel programs that share memory , 1988, TOPL.
[100] Mark D. Hill,et al. Implementing Sequential Consistency in Cache-Based Systems , 1990, ICPP.
[101] Anant Agarwal,et al. LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.
[102] Butler W. Lampson,et al. Experience with processes and monitors in Mesa , 1980, CACM.
[103] Josep Torrellas,et al. Estimating the Performance Advantages of Relaxing Consistency in a Shared Memory Multiprocessor , 1990, ICPP.
[104] Werner Buchholz,et al. Planning a Computer System: Project Stretch , 1962 .
[105] Richard L. Sites,et al. Alpha AXP architecture reference manual , 1995 .
[106] Alan Jay Smith,et al. Branch Prediction Strategies and Branch Target Buffer Design , 1984, Computer.
[107] Todd C. Mowry,et al. Tolerating latency through software-controlled data prefetching , 1994 .
[108] Anoop Gupta,et al. Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations , 1990, Proceedings SUPERCOMPUTING '90.
[109] Arthur J. Bernstein,et al. Analysis of Programs for Parallel Processing , 1966, IEEE Trans. Electron. Comput..
[110] Willy Zwaenepoel,et al. Implementation and performance of Munin , 1991, SOSP '91.
[111] Gregory R. Andrews,et al. Concurrent programming - principles and practice , 1991 .
[112] David Callahan,et al. A future-based parallel language for a general-purpose highly-parallel computer , 1990 .
[113] James P. Laudon,et al. Architectural and Implementation Tradeoffs for Multiple-Context Processors , 1995 .
[114] Jean-Loup Baer,et al. A Performance Study of Memory Consistency Models , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[115] Mark D. Hill,et al. Sufficient Conditions for Implementing the Data-Race-Free-1 Memory Model , 1992 .