Modular divide-and-conquer parallelization of nested loops
[1] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[2] Michaël Rusinowitch,et al. Any ground associative-commutative theory has a finite canonical system , 1996, Journal of Automated Reasoning.
[3] Sergei Gorlatch,et al. Parallelizing functional programs by generalization , 1999 .
[4] Margaret Martonosi,et al. Characterizing and improving the performance of Intel Threading Building Blocks , 2008, 2008 IEEE International Symposium on Workload Characterization.
[5] Chuck Pheatt,et al. Intel® threading building blocks , 2008 .
[6] Sharad Malik,et al. Retargetable Very Long Instruction Word Compiler Framework for Digital Signal Processors , 2002 .
[7] Yosi Ben-Asher,et al. Parallel Solutions of Simple Indexed Recurrence Equations , 2001, IEEE Trans. Parallel Distributed Syst..
[8] Azadeh Farzan,et al. Modular Synthesis of Divide-and-Conquer Parallelism for Nested Loops (Extended Version) , 2019, ArXiv.
[9] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[10] Cédric Bastoul,et al. Code generation in the polyhedral model is easier than you think , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[11] Rajeev Alur,et al. Syntax-guided synthesis , 2013, 2013 Formal Methods in Computer-Aided Design.
[12] Albert Cohen,et al. Polyhedral Code Generation in the Real World , 2006, CC.
[13] Ron Shamir,et al. Faster subtree isomorphism , 1997, Proceedings of the Fifth Israeli Symposium on Theory of Computing and Systems.
[14] Jan Gustafsson,et al. Automatic Derivation of Loop Bounds and Infeasible Paths for WCET Analysis Using Abstract Execution , 2006, 2006 27th IEEE International Real-Time Systems Symposium (RTSS'06).
[15] Guy E. Blelloch,et al. Internally deterministic parallel algorithms can be fast , 2012, PPoPP '12.
[16] Todd Mytkowicz,et al. Parallelizing user-defined aggregations using symbolic execution , 2015, SOSP.
[17] Sergei Gorlatch,et al. Systematic Extraction and Implementation of Divide-and-Conquer Parallelism , 1996, PLILP.
[18] K. Rustan M. Leino,et al. Dafny: An Automatic Program Verifier for Functional Correctness , 2010, LPAR.
[19] Azadeh Farzan,et al. Synthesis of divide and conquer parallelism for loops , 2017, PLDI.
[20] Priti Shankar,et al. The Compiler Design Handbook: Optimizations and Machine Code Generation , 2002, The Compiler Design Handbook.
[21] Alvin Cheung,et al. Verified lifting of stencil computations , 2016, PLDI.
[22] Akimasa Morihata,et al. Automatic inversion generates divide-and-conquer parallel programs , 2007, PLDI '07.
[23] Maaz Bin Safeer Ahmad,et al. Gradual synthesis for static parallelization of single-pass array-processing programs , 2017, PLDI.
[24] Kevin Skadron,et al. Scalable parallel programming , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[25] Guy E. Blelloch,et al. Prefix sums and their applications , 1990 .
[26] Chau-Wen Tseng,et al. A comparison of parallelization techniques for irregular reductions , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[27] Yunheung Paek,et al. Parallel Programming with Polaris , 1996, Computer.
[28] Sergei Gorlatch,et al. Extracting and Implementing List Homomorphisms in Parallel Program Development , 1999, Sci. Comput. Program..
[29] Manu Sridharan,et al. Translating imperative code to MapReduce , 2014, OOPSLA 2014.
[30] W. Daniel Hillis,et al. Data parallel algorithms , 1986, CACM.
[31] Mahmut T. Kandemir,et al. Compilation for distributed memory architectures , 2002 .
[32] Claude Marché,et al. Termination of Associative-Commutative Rewriting by Dependency Pairs , 1998, RTA.
[33] Jeremy Gibbons. The Third Homomorphism Theorem , 1996, J. Funct. Program..
[34] Akimasa Morihata,et al. Automatic Parallelization of Recursive Functions Using Quantifier Elimination , 2010, FLOPS.
[35] Armando Solar-Lezama,et al. Deriving divide-and-conquer dynamic programming algorithms using solver-aided transformations , 2016, OOPSLA.
[36] Allan L. Fisher,et al. Parallelizing complex scans and reductions , 1994, PLDI '94.
[37] Daniel Cordes,et al. A Fast and Precise Static Loop Analysis Based on Abstract Interpretation, Program Slicing and Polytope Models , 2009, 2009 International Symposium on Code Generation and Optimization.
[38] Ute Schmid,et al. Inductive Synthesis of Functional Programs: An Explanation Based Generalization Approach , 2006, J. Mach. Learn. Res..
[39] Aws Albarghouthi,et al. MapReduce program synthesis , 2016, PLDI.
[40] Keshav Pingali,et al. The tao of parallelism in algorithms , 2011, PLDI '11.