Memory consistency models for shared-memory multiprocessors
[1] Jong-Deok Choi,et al. An efficient cache-based access anomaly detection scheme , 1991, ASPLOS IV.
[2] Monica S. Lam,et al. Jade: a high-level, machine-independent language for parallel programming , 1993, Computer.
[3] Samuel P. Midkiff,et al. Compiling programs with user parallelism , 1990 .
[4] Trevor Mudge,et al. Performance optimization of pipelined primary cache , 1992, ISCA '92.
[5] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[6] Roy Friedman,et al. Shared memory consistency conditions for non-sequential execution: definitions and programming strategies , 1993, SPAA '93.
[7] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[8] Susan J. Eggers,et al. On the validity of trace-driven simulation for multiprocessors , 1991, ISCA '91.
[9] Dennis Shasha,et al. Efficient and correct execution of parallel programs that share memory , 1988, TOPL.
[10] Mark D. Hill,et al. Implementing Sequential Consistency in Cache-Based Systems , 1990, ICPP.
[11] Robert J. Fowler,et al. Adaptive cache coherency for detecting migratory shared data , 1993, ISCA '93.
[12] Richard P. LaRowe,et al. Hiding Shared Memory Reference Latency on the Galactica Net Distributed Shared Memory Architecture , 1992, J. Parallel Distributed Comput..
[13] David Padua,et al. Debugging Fortran on a shared memory machine , 1987 .
[14] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[15] Mark D. Hill,et al. A Unified Formalization of Four Shared-Memory Models , 1993, IEEE Trans. Parallel Distributed Syst..
[16] Sarita V. Adve,et al. Designing memory consistency models for shared-memory multiprocessors , 1993 .
[17] Anoop Gupta,et al. The directory-based cache coherence protocol for the DASH multiprocessor , 1990, ISCA '90.
[18] Anant Agarwal,et al. APRIL: a processor architecture for multiprocessing , 1990, ISCA '90.
[19] Niklaus Wirth,et al. Modula: A language for modular multiprogramming , 1977, Softw. Pract. Exp..
[20] Per Brinch Hansen,et al. The Architecture of Concurrent Programs , 1977 .
[21] Silvio Turrini,et al. Optimal group distribution in carry-skip adders , 1989, Proceedings of 9th Symposium on Computer Arithmetic.
[22] Butler W. Lampson,et al. Experience with processes and monitors in Mesa , 1980, CACM.
[23] Alan L. Cox,et al. Lazy release consistency for software distributed shared memory , 1992, ISCA '92.
[24] Anoop Gupta,et al. Two Techniques to Enhance the Performance of Memory Consistency Models , 1991, ICPP.
[25] Willy Zwaenepoel,et al. Implementation and performance of Munin , 1991, SOSP '91.
[26] W. R. Hamburgen,et al. Precise robotic paste dot dispensing , 1989, Proceedings., 39th Electronic Components Conference.
[27] Gregory R. Andrews,et al. Concurrent programming - principles and practice , 1991 .
[28] John L. Hennessy,et al. Finding and Exploiting Parallelism in an Ocean Simulation Program: Experience, Results, and Implications , 1992, J. Parallel Distributed Comput..
[29] James H. Patterson,et al. Portable Programs for Parallel Processors , 1987 .
[30] Mats Brorsson,et al. An adaptive cache coherence protocol optimized for migratory sharing , 1993, ISCA '93.
[31] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1987, TOPL.
[32] Ken Kennedy,et al. Parallel program debugging with on-the-fly anomaly detection , 1990, Proceedings SUPERCOMPUTING '90.
[33] Brian N. Bershad,et al. PRESTO: A system for object‐oriented parallel programming , 1988, Softw. Pract. Exp..
[34] Brian N. Bershad,et al. The Midway distributed shared memory system , 1993, Digest of Papers. Compcon Spring.
[35] Brian N. Bershad,et al. Midway : shared memory parallel programming with entry consistency for distributed memory multiprocessors , 1991 .
[36] Alan L. Cox,et al. Evaluation of release consistent software distributed shared memory on emerging network technology , 1993, ISCA '93.
[37] J. Mcdonald,et al. Vectorization of a particle simulation method for hypersonic rarefied flow , 1988 .
[38] Leslie Lamport,et al. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 1979, IEEE Transactions on Computers.
[39] Katherine A. Yelick,et al. Optimizing parallel programs with explicit synchronization , 1995, PLDI '95.
[40] James R. Goodman,et al. Cache Consistency and Sequential Consistency , 1991 .
[41] Bob Beck,et al. Shared-memory parallel programming in C++ , 1990, IEEE Software.
[42] Andrew R. Pleszkun,et al. Implementation of precise interrupts in pipelined processors , 1985, ISCA '85.
[43] David B. Loveman. High performance Fortran , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[44] Barton P. Miller,et al. On the Complexity of Event Ordering for Shared-Memory Parallel Program Executions , 1990, ICPP.
[45] William W. Collier,et al. Reasoning about parallel architectures , 1992 .
[46] Larry Rudolph,et al. Dynamic decentralized cache schemes for MIMD parallel processors , 1984, ISCA '84.
[47] David W. Wall,et al. Systems for Late Code Modification , 1991, Code Generation.
[48] Barton P. Miller,et al. Detecting data races on weak memory systems , 1991, ISCA '91.
[49] Michel Dubois,et al. Concurrent Miss Resolution in Multiprocessor Caches , 1988, ICPP.
[50] Yale Patt,et al. Exploiting fine-grained parallelism through a combination of hardware and software techniques , 1991, ISCA '91.
[51] A. Gupta,et al. Parallel distributed-time logic simulation , 1989, IEEE Design & Test of Computers.
[52] William M. Johnson,et al. Super-scalar processor design , 1989 .
[53] Cathy May,et al. The PowerPC Architecture: A Specification for a New Family of RISC Processors , 1994 .
[54] Anant Agarwal,et al. Closing the window of vulnerability in multiphase memory transactions , 1992, ASPLOS V.
[55] Daniel E. Lenoski,et al. The design and analysis of DASH: a scalable directory-based multiprocessor , 1992 .
[56] Anoop Gupta,et al. Programming for Different Memory Consistency Models , 1992, J. Parallel Distributed Comput..
[57] Michel Cekleov,et al. Formal Specification of Memory Models , 1992 .
[58] Yehuda Afek,et al. A lazy cache algorithm , 1989, SPAA '89.
[59] Ken Kennedy,et al. Compile-time detection of race conditions in a parallel program , 1989, ICS '89.
[60] Maurice Herlihy,et al. Linearizability: a correctness condition for concurrent objects , 1990, TOPL.
[61] Michael D. Smith,et al. Boosting beyond static scheduling in a superscalar processor , 1990, ISCA '90.
[62] Barton P. Miller,et al. Improving the accuracy of data race detection , 1991, PPOPP '91.
[63] Michel Dubois,et al. Access ordering and coherence in shared memory multiprocessors , 1989 .
[64] Jeffrey C. Mogul. Observing TCP dynamics in real networks , 1992, SIGCOMM '92.
[65] Alan Jay Smith,et al. Branch Prediction Strategies and Branch Target Buffer Design , 1984, Computer.
[66] Jonathan Rose. LocusRoute: a parallel global router for standard cells , 1988, 25th ACM/IEEE Design Automation Conference, Proceedings 1988.
[67] Todd C. Mowry,et al. Tolerating latency through software-controlled data prefetching , 1994 .
[68] Arthur J. Bernstein,et al. Analysis of Programs for Parallel Processing , 1966, IEEE Trans. Electron. Comput..
[69] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[70] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[71] Anoop Gupta,et al. The Stanford FLASH Multiprocessor , 1994, ISCA.
[72] Norman P. Jouppi,et al. Complexity/performance tradeoffs with non-blocking loads , 1994, ISCA '94.
[73] Paul Hudak,et al. Memory coherence in shared virtual memory systems , 1989, TOCS.
[74] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[75] R.K. Brayton,et al. Automatic verification of memory systems which service their requests out of order , 1995, Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair.
[76] Michel Dubois,et al. Lockup-free Caches in High-Performance Multiprocessors , 1990, J. Parallel Distributed Comput..
[77] James P. Laudon,et al. Architectural and Implementation Tradeoffs for Multiple-Context Processors , 1995 .
[78] Anoop Gupta,et al. Comparative evaluation of latency reducing and tolerating techniques , 1991, ISCA '91.
[79] Katherine A. Yelick,et al. Optimizing Parallel SPMD Programs , 1994, LCPC.
[80] Kevin P. McAuliffe,et al. RP3 Processor-Memory Element , 1985, ICPP.
[81] Russell Kao,et al. Piecewise Linear Models for Switch-Level Simulation , 1992 .
[82] Helen Davis,et al. Tango introduction and tutorial , 1990 .
[83] Barton P. Miller,et al. Detecting Data Races in Parallel Program Executions , 1989 .
[84] Michel Dubois,et al. Correct memory operation of cache-based multiprocessors , 1987, ISCA '87.
[85] Edith Schonberg,et al. An empirical comparison of monitoring algorithms for access anomaly detection , 1990, PPOPP '90.
[86] Richard N. Taylor,et al. A general-purpose algorithm for analyzing concurrent programs , 1983, CACM.
[87] Anoop Gupta,et al. Hiding memory latency using dynamic scheduling in shared-memory multiprocessors , 1992, ISCA '92.
[88] Christos H. Papadimitriou,et al. The Theory of Database Concurrency Control , 1986 .
[89] Michel Dubois,et al. Delayed consistency and its effects on the miss rate of parallel programs , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[90] Jean-Loup Baer,et al. A performance study of memory consistency models , 1992, ISCA '92.
[91] Stein Gjessing,et al. Distributed-directory scheme: scalable coherent interface , 1990, Computer.
[92] Michel Dubois,et al. Memory Access Dependencies in Shared-Memory Multiprocessors , 1990, IEEE Trans. Software Eng..
[93] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[94] Erik Hagersten,et al. Race-free interconnection networks and multiprocessor consistency , 1991, ISCA '91.
[95] Anant Agarwal,et al. LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.
[96] Robert M. Keller,et al. Look-Ahead Processors , 1975, CSUR.
[97] Josep Torrellas,et al. False Sharing and Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.
[98] Josep Torrellas,et al. Estimating the Performance Advantages of Relaxing Consistency in a Shared Memory Multiprocessor , 1990, ICPP.
[99] Werner Buchholz,et al. Planning a Computer System: Project Stretch , 1962 .
[100] Yehuda Afek,et al. Lazy caching , 1993, TOPL.
[101] Richard L. Sites,et al. Alpha AXP architecture reference manual , 1995 .
[102] Kourosh Gharachorloo,et al. Detecting violations of sequential consistency , 1991, SPAA '91.
[103] David L. Dill,et al. An executable specification, analyzer and verifier for RMO (relaxed memory order) , 1995, SPAA '95.
[104] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[105] Anoop Gupta,et al. Performance evaluation of memory consistency models for shared-memory multiprocessors , 1991, ASPLOS IV.
[106] Richard Noah Zucker,et al. Relaxed consistency and synchronization in parallel processors , 1992 .
[107] Richard Thomas Simoni Jr.,et al. Cache coherence directories for scalable multiprocessors , 1992 .