Semantic Language Extensions for Implicit Parallel Programming
[1] Nathan Clark,et al. Commutativity analysis for software parallelization: letting program transformations see the big picture , 2009, ASPLOS.
[2] Rohit Chandra,et al. Parallel programming in openMP , 2000 .
[3] Santosh Pande,et al. Efficiently speeding up sequential computation through the n-way programming model , 2011, OOPSLA '11.
[4] Chen Ji,et al. A 14.6 billion degrees of freedom, 5 teraflops, 2.5 terabyte earthquake simulation on the Earth Simulator , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[5] Sharon C. Glotzer,et al. HOOMD-blue, general-purpose many-body dynamics on the GPU , 2010 .
[6] Kunle Olukotun,et al. The OpenTM Transactional Application Programming Interface , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[7] Ken Kennedy,et al. The D Editor: a new interactive parallel programming tool , 1994, Proceedings of Supercomputing '94.
[8] S. I. Feldman,et al. A Fortran to C converter , 1990, FORF.
[9] Maurice Herlihy,et al. Transactional boosting: a methodology for highly-concurrent transactional objects , 2008, PPoPP.
[10] Chau-Wen Tseng,et al. Improving compiler and run-time support for adaptive irregular codes , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[11] Maurice Herlihy,et al. Coarse-grained transactions , 2010, POPL '10.
[12] Niklaus Wirth,et al. A Plea for Lean Software , 1995, Computer.
[13] Sarita V. Adve,et al. Shared Memory Consistency Models: A Tutorial , 1996, Computer.
[14] Curtis R. Cook,et al. Are expectations for parallelism too high? a survey of potential parallel users , 1994, Proceedings of Supercomputing '94.
[15] Alejandro Duran,et al. The Design of OpenMP Tasks , 2009, IEEE Transactions on Parallel and Distributed Systems.
[16] Easwaran Raman,et al. Speculative Decoupled Software Pipelining , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[17] Martin C. Carlisle,et al. Olden: parallelizing programs with dynamic data structures on distributed-memory machines , 1996 .
[18] Anthony Skjellum,et al. Using MPI - portable parallel programming with the message-passing interface , 1994 .
[19] Michael F. P. O'Boyle,et al. Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping , 2009, PLDI '09.
[20] Jeffrey Overbey,et al. A type and effect system for deterministic parallel Java , 2009, OOPSLA 2009.
[21] David A. Padua,et al. Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs , 1991, LCPC.
[22] Michael F. P. O'Boyle,et al. Partitioning streaming parallelism for multi-cores: A machine learning based approach , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[23] Saturnino Garcia,et al. Kremlin: rethinking and rebooting gprof for the multicore age , 2011, PLDI '11.
[24] Todd C. Mowry,et al. The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[25] D. E. Stevenson,et al. Science, computational science, and computer science: at a crossroads , 1994, CACM.
[26] Jonathan Eastep,et al. Smart data structures: an online machine learning approach to multicore data structures , 2011, ICAC '11.
[27] Sean R. Eddy,et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids , 1998 .
[28] Martin C. Rinard,et al. Verification of semantic commutativity conditions and inverse operations on linked data structures , 2011, PLDI '11.
[29] Assaf J. Kfoury,et al. Formal semantics of weak references , 2005, ISMM '06.
[30] Matteo Frigo,et al. The implementation of the Cilk-5 multithreaded language , 1998, PLDI.
[31] Satnam Singh,et al. Feedback directed implicit parallelism , 2007, ICFP '07.
[32] Yun Zhang,et al. Commutative set: a language extension for implicit parallel programming , 2011, PLDI '11.
[33] Ron Cytron,et al. Doacross: Beyond Vectorization for Multiprocessors , 1986, ICPP.
[34] Zhiyuan Li,et al. ASYNC Loop Constructs for Relaxed Synchronization , 2008, LCPC.
[35] Michel Juillard,et al. Dynare: a program for the resolution and simulation of dynamic models with forward variables through the use of a relaxation algorithm , 1996 .
[36] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[37] Weixiong Zhang,et al. Phase Transitions and Backbones of 3-SAT and Maximum 3-SAT , 2001, CP.
[38] Alan Edelman,et al. PetaBricks: a language and compiler for algorithmic choice , 2009, PLDI '09.
[39] Jan Smans,et al. Deadlock-Free Channels and Locks , 2010, ESOP.
[40] Stephen John Turner,et al. Tulipse: A Visualization Framework for User-Guided Parallelization , 2012, Euro-Par.
[41] Keshav Pingali,et al. How much parallelism is there in irregular applications? , 2009, PPoPP '09.
[42] Martin C. Rinard,et al. Commutativity analysis: a new analysis framework for parallelizing compilers , 1996, PLDI '96.
[43] David A. Padua,et al. Beyond Arrays - A Container-Centric Approach for Parallelization of Real-World Symbolic Applications , 1998, LCPC.
[44] Martin C. Rinard,et al. Eliminating synchronization bottlenecks using adaptive replication , 2003, TOPL.
[45] Swarat Chaudhuri,et al. Parallel programming with object assemblies , 2009, OOPSLA 2009.
[46] Arvind,et al. Implicit parallel programming in pH , 2001 .
[47] Yen-Kuang Chen,et al. The ALPBench benchmark suite for complex multimedia applications , 2005, IEEE International. 2005 Proceedings of the IEEE Workload Characterization Symposium, 2005.
[48] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[49] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001 .
[50] George C. Necula,et al. Specifying and checking semantic atomicity for multithreaded programs , 2011, ASPLOS XVI.
[51] Easwaran Raman,et al. Spice: speculative parallel iteration chunk execution , 2008, CGO '08.
[52] Emery D. Berger,et al. Grace: safe multithreaded programming for C/C++ , 2009, OOPSLA 2009.
[53] Trevor Mudge,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001 .
[54] Guilherme Ottoni,et al. Global instruction scheduling for multi-threaded architectures , 2008 .
[55] Kunle Olukotun,et al. The Atomos transactional programming language , 2006, PLDI '06.
[56] David S. Bolme,et al. FacePerf: Benchmarks for Face Recognition Algorithms , 2007, 2007 IEEE 10th International Symposium on Workload Characterization.
[57] Peter Sewell,et al. Clarifying and compiling C/C++ concurrency: from C++11 to POWER , 2012, POPL '12.
[58] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[59] Paulo F. Flores,et al. PMSat: a parallel version of MiniSAT , 2008, J. Satisf. Boolean Model. Comput..
[60] Hideya Iwasaki,et al. Automatic parallelization via matrix multiplication , 2011, PLDI '11.
[61] Berkin Özisikyilmaz,et al. MineBench: A Benchmark Suite for Data Mining Workloads , 2006, 2006 IEEE International Symposium on Workload Characterization.
[62] Olga G. Troyanskaya,et al. The Sleipnir library for computational functional genomics , 2008, Bioinform..
[63] Serdar Tasiran,et al. An annotation assistant for interactive debugging of programs with common synchronization idioms , 2009, PADTAD '09.
[64] Yun Zhang,et al. Revisiting the Sequential Programming Model for Multi-Core , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[65] Jesús Labarta,et al. A Simulation of Seismic Wave Propagation at High Resolution in the Inner Core of the Earth on 2166 Processors of MareNostrum , 2008, VECPAR.
[66] Larry Smarr,et al. Supercomputing and the transformation of science , 1993 .
[67] D. Lettenmaier,et al. A simple hydrologically based model of land surface water and energy fluxes for general circulation models , 1994 .
[68] Scott A. Mahlke,et al. Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory , 2009, PLDI '09.
[69] Joshua S. Auerbach,et al. Lime: a Java-compatible and synthesizable language for heterogeneous architectures , 2010, OOPSLA.
[70] Engin Ipek,et al. Coordinated management of multiple interacting resources in chip multiprocessors: A machine learning approach , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[71] Ayal Zaks,et al. Fast condensation of the program dependence graph , 2013, PLDI.
[72] Brian Demsky,et al. OoOJava: an out-of-order approach to parallel programming , 2010 .
[73] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[74] Chen Ding,et al. Software behavior oriented parallelization , 2007, PLDI '07.
[75] Vahid Tabatabaee,et al. Parallel Parameter Tuning for Applications with Performance Variability , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[76] Martyn Plummer,et al. JAGS: Just Another Gibbs Sampler , 2012 .
[77] Jeremy Kepner,et al. The HPEC Challenge Benchmark Suite , 2006 .
[78] Christoforos E. Kozyrakis,et al. Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[79] Wendong Hu,et al. NetBench: a benchmarking suite for network processors , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).
[80] Marco Dorigo,et al. The hyper-cube framework for ant colony optimization , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[81] Hsien-Hsin S. Lee,et al. Kicking the tires of software transactional memory: why the going gets tough , 2008, SPAA '08.
[82] Ken Kennedy,et al. Interactive Parallel Programming using the ParaScope Editor , 1991, IEEE Trans. Parallel Distributed Syst..
[83] Saturnino Garcia,et al. Kismet: parallel speedup estimates for serial programs , 2011, OOPSLA '11.
[84] Alan Sussman,et al. AARTS: low overhead online adaptive auto-tuning , 2011, EXADAPT '11.
[85] Keshav Pingali,et al. Optimistic parallelism requires abstractions , 2007, PLDI '07.
[86] R. E. Kurt Stirewalt,et al. Incremental dependence analysis for interactive parallelization , 1990, ICS '90.
[87] Perry R. Cook,et al. ChucK: A Concurrent, On-the-fly, Audio Programming Language , 2003, ICMC.
[88] Martin Rinard,et al. Reasoning about Relaxed Programs , 2011 .
[89] Peter G. Harrison,et al. Parallel Programming Using Skeleton Functions , 1993, PARLE.
[90] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1987, TOPL.
[91] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.
[92] Alan Mycroft,et al. A lightweight in-place implementation for software thread-level speculation , 2009, SPAA '09.
[93] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[94] Brian Ensink,et al. Language and Compiler Support for Adaptive Distributed Applications , 2001 .
[95] Easwaran Raman,et al. Parallel-stage decoupled software pipelining , 2008, CGO '08.
[96] Scott A. Mahlke,et al. Uncovering hidden loop level parallelism in sequential applications , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[97] Lixia Liu,et al. Improving parallelism and locality with asynchronous algorithms , 2010, PPoPP '10.
[98] Ranjit Jhala,et al. Deterministic parallelism via liquid effects , 2012, PLDI '12.
[99] David R. Butenhof. Programming with POSIX threads , 1993 .
[100] Suresh Jagannathan,et al. Safe futures for Java , 2005, OOPSLA '05.
[101] John H. Reppy. Concurrent ML: Design, Application and Semantics , 1993, Functional Programming, Concurrency, Simulation and Automated Reasoning.
[102] Dan Grossman,et al. Type-safe multithreading in cyclone , 2003, TLDI '03.
[103] Amer Diwan,et al. SUIF Explorer: an interactive and interprocedural parallelizer , 1999, PPoPP '99.
[104] Don Coppersmith,et al. The Data Encryption Standard (DES) and its strength against attacks , 1994, IBM J. Res. Dev..
[105] Teresa H. Y. Meng,et al. Merge: a programming model for heterogeneous multi-core systems , 2008, ASPLOS.
[106] Guilherme Ottoni,et al. Automatic thread extraction with decoupled software pipelining , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[107] Martin Rinard,et al. The design, implementation and evaluation of Jade: a portable, implicitly parallel programming language , 1994 .
[108] Luis Ceze,et al. Implicit parallelism with ordered transactions , 2007, PPoPP.
[109] Lakhdar Sais,et al. ManySAT: a Parallel SAT Solver , 2009, J. Satisf. Boolean Model. Comput..
[110] Stefano de Gironcoli,et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials , 2009, Journal of physics. Condensed matter : an Institute of Physics journal.
[111] Kunle Olukotun,et al. Transactional collection classes , 2007, PPOPP.
[112] Eric C. R. Hehner,et al. A Practical Theory of Programming , 1993, Texts and Monographs in Computer Science.
[113] Simon L. Peyton Jones,et al. Data parallel Haskell: a status report , 2007, DAMP '07.
[114] Insung Park,et al. Parallel programming environment for OpenMP , 2001, Sci. Program..
[115] Serge J. Belongie,et al. SD-VBS: The San Diego Vision Benchmark Suite , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[116] Antonia Zhai,et al. A scalable approach to thread-level speculation , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[117] Yale N. Patt,et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[118] Michiel J. L. de Hoon,et al. Bioinformatics and Computational Biology with Biopython , 2003 .
[119] Marc Snir,et al. Getting Up to Speed: The Future of Supercomputing , 2004 .
[120] Niklas Sörensson,et al. An Extensible SAT-solver , 2003, SAT.
[121] Satoshi Matsuoka,et al. Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[122] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.
[123] Steve Plimpton,et al. Fast parallel algorithms for short-range molecular dynamics , 1993 .
[124] Giorgios Kollias,et al. Asynchronous Iterative Algorithms , 2011, Encyclopedia of Parallel Computing.
[125] William E. Weihl,et al. Commutativity-based concurrency control for abstract data types , 1988, [1988] Proceedings of the Twenty-First Annual Hawaii International Conference on System Sciences. Volume II: Software track.
[126] David A. Wood,et al. ASR: Adaptive Selective Replication for CMP Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[127] Matthew J. Bridges,et al. The velocity compiler: extracting efficient multicore execution from legacy sequential codes , 2008 .
[128] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[129] Sebastian Burckhardt,et al. The design of a task parallel library , 2009, OOPSLA 2009.
[130] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[131] Mark Weiser,et al. Program Slicing , 1981, IEEE Transactions on Software Engineering.
[132] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[133] Abhishek Udupa,et al. ALTER: exploiting breakable dependences for parallelization , 2011, PLDI '11.
[134] Ana Sokolova,et al. Scalability versus semantics of concurrent FIFO queues , 2011, PODC '11.
[135] Keshav Pingali,et al. Exploiting the commutativity lattice , 2011, PLDI '11.
[136] Cleve B. Moler,et al. Numerical computing with MATLAB , 2004 .
[137] Danny Dig. A Refactoring Approach to Parallelism , 2011, IEEE Software.
[138] Serdar Tasiran,et al. A calculus of atomic actions , 2009, POPL '09.
[139] Matteo Frigo,et al. Reducers and other Cilk++ hyperobjects , 2009, SPAA '09.
[140] Josep Torrellas,et al. Speculative synchronization: applying thread-level speculation to explicitly parallel applications , 2002, ASPLOS X.
[141] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .
[142] Lei Liu,et al. Safe parallel programming using dynamic dependence hints , 2011, OOPSLA '11.
[143] Kunle Olukotun,et al. STAMP: Stanford Transactional Applications for Multi-Processing , 2008, 2008 IEEE International Symposium on Workload Characterization.
[144] Lawrence Rauchwerger,et al. The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization , 1995, PLDI '95.
[145] Patrick Th. Eugster,et al. Ribbons: a partially shared memory programming model , 2011, OOPSLA '11.
[146] G. Ramalingam,et al. Safe programmable speculative parallelism , 2010, PLDI '10.
[147] Sanjay J. Patel,et al. Implicitly Parallel Programming Models for Thousand-Core Microprocessors , 2007, 2007 44th ACM/IEEE Design Automation Conference.
[148] James Reinders,et al. Intel® threading building blocks , 2008 .
[149] Adam Welc,et al. Design and implementation of transactional constructs for C/C++ , 2008, OOPSLA '08.
[150] Joel H. Saltz,et al. Run-time and compile-time support for adaptive irregular problems , 1994, Proceedings of Supercomputing '94.
[151] Ayal Zaks,et al. Speculative separation for privatization and reductions , 2012, PLDI.
[152] Mendel Rosenblum,et al. Streamware: programming general-purpose multicore processors using streams , 2008, ASPLOS.
[153] Cherri M. Pancake,et al. What users need in parallel tool support: survey results and analysis , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[154] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).