[1] Thomas F. Wenisch, et al. Mechanisms for store-wait-free multiprocessors, 2007, ISCA '07.
[2] Sen Hu, et al. Efficient system-enforced deterministic parallelism, 2010, OSDI.
[3] David A. Padua, et al. Advanced compiler optimizations for supercomputers, 1986, CACM.
[4] Kunle Olukotun, et al. Data speculation support for a chip multiprocessor, 1998, ASPLOS VIII.
[5] Sebastian Burckhardt, et al. Concurrent programming with revisions and isolation types, 2010, OOPSLA.
[6] Brian W. Barrett, et al. Introducing the Graph 500, 2010.
[7] Antonia Zhai, et al. The STAMPede approach to thread-level speculation, 2005, TOCS.
[8] Sebastian Burckhardt, et al. Two for the price of one: a model for parallel and incremental computation, 2011, OOPSLA '11.
[9] Keir Fraser, et al. Concurrent programming without locks, 2007, TOCS.
[10] Brandon Lucia, et al. Conflict exceptions: simplifying concurrent language semantics with precise hardware exceptions for data-races, 2010, ISCA.
[11] Michael F. Spear, et al. An integrated hardware-software approach to flexible transactional memory, 2007, ISCA '07.
[12] Dan Grossman, et al. CoreDet: a compiler and runtime system for deterministic multithreaded execution, 2010, ASPLOS XV.
[13] Swarnendu Biswas, et al. Valor: efficient, software-only region conflict exceptions, 2015, OOPSLA.
[14] Antonia Zhai, et al. A scalable approach to thread-level speculation, 2000, Proceedings of 27th International Symposium on Computer Architecture.
[15] Bradley C. Kuszmaul, et al. Unbounded Transactional Memory, 2005, HPCA.
[16] Martin C. Rinard, et al. Commutativity analysis: a new analysis technique for parallelizing compilers, 1997, TOPLAS.
[17] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, IEEE International Symposium on Workload Characterization (IISWC).
[18] Kunle Olukotun, et al. Programming with transactional coherence and consistency (TCC), 2004, ASPLOS XI.
[19] Daniel Sánchez, et al. Exploiting commutativity to reduce the cost of updates to shared data in cache-coherent systems, 2015, 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[20] Yun Zhang, et al. Commutative set: a language extension for implicit parallel programming, 2011, PLDI '11.
[21] David A. Wood, et al. A Primer on Memory Consistency and Cache Coherence, 2012, Synthesis Lectures on Computer Architecture.
[22] Matteo Frigo, et al. Reducers and other Cilk++ hyperobjects, 2009, SPAA '09.
[23] E. Berger, et al. Grace: Safe and Efficient Concurrent Programming, 2008.
[24] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[25] Milo M. K. Martin, et al. RETCON: transactional repair without replay, 2010, ISCA '10.
[26] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.
[27] Zhiyuan Li, et al. General data structure expansion for multi-threading, 2013, PLDI.
[28] David A. Patterson, et al. The GAP Benchmark Suite, 2015, ArXiv.
[29] Henry Hoffmann, et al. Managing performance vs. accuracy trade-offs with loop perforation, 2011, ESEC/FSE '11.
[30] Satish Narayanasamy, et al. DRFX: a simple and efficient memory model for concurrent programming languages, 2010, PLDI '10.
[31] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[32] Norman P. Jouppi, et al. CACTI 6.0: A Tool to Model Large Caches, 2009.
[33] Thomas F. Wenisch, et al. InvisiFence: performance-transparent memory ordering in conventional multiprocessors, 2009, ISCA '09.
[34] Josep Torrellas, et al. BulkSC: bulk enforcement of sequential consistency, 2007, ISCA '07.
[35] Christopher J. Hughes, et al. Performance evaluation of Intel® Transactional Synchronization Extensions for high-performance computing, 2013, International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[36] David A. Padua, et al. Automatic Array Privatization, 1993, Compiler Optimizations for Scalable Parallel Systems Languages.
[37] Todd Mytkowicz, et al. Parallelizing user-defined aggregations using symbolic execution, 2015, SOSP.