Productivity, portability, performance: data-centric Python
Torsten Hoefler | Timo Schneider | Tiziano De Matteis | Johannes de Fine Licht | Luca Lavarini | Tal Ben-Nun | Alexandru Calotoiu | Alexandros Nikolaos Ziogas