TiDA: High-Level Programming Abstractions for Data Locality Management
John Shalf | George Michelogiannakis | Ann S. Almgren | Tan Nguyen | Muhammed Nufail Farooqi | Didem Unat | Weiqun Zhang | Burak Bastem
[1] J. B. Bell, et al. High-order algorithms for compressible reacting flow with complex chemistry, 2013, arXiv:1309.7327.
[2] Sanjay V. Rajopadhye, et al. Multi-level tiling: M for the price of one, 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[3] Zhigang Mao, et al. An application specific NoC mapping for optimized delay, 2006, International Conference on Design and Test of Integrated Systems in Nanoscale Technology, 2006. DTIS 2006.
[4] John Shalf, et al. Programming Abstractions for Data Locality, 2014.
[5] John Shalf, et al. BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework, 2016, SIAM J. Sci. Comput.
[6] William J. Dally, et al. Design tradeoffs for tiled CMP on-chip networks, 2006, ICS '06.
[7] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[8] Radu Marculescu, et al. Energy-aware mapping for tile-based NoC architectures under performance constraints, 2003, ASP-DAC '03.
[9] Sriram Krishnamoorthy, et al. Parametric multi-level tiling of imperfectly nested loops, 2009, ICS.
[10] Davide Bertozzi, et al. Supporting Task Migration in Multi-Processor Systems-on-Chip: A Feasibility Study, 2006, Proceedings of the Design Automation & Test in Europe Conference.
[11] Srinivasan Murali, et al. Bandwidth-constrained mapping of cores onto NoC architectures, 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[12] John Shalf, et al. Exascale Computing Technology Challenges, 2010, VECPAR.
[13] Scott Klasky, et al. Terascale direct numerical simulations of turbulent combustion using S3D, 2008.
[14] David A. Padua, et al. Programming for parallelism and locality with hierarchically tiled arrays, 2006, PPoPP '06.
[15] Samuel Williams, et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[16] Karl Fürlinger, et al. Expressing and Exploiting Multi-Dimensional Locality in DASH, 2016, Software for Exascale Computing.
[17] John Shalf, et al. Tiling as a Durable Abstraction for Parallelism and Data Locality, 2013.
[18] Sanjay V. Rajopadhye, et al. Parameterized tiled loops for free, 2007, PLDI '07.
[19] Brice Goglin, et al. Managing the topology of heterogeneous cluster nodes with hardware locality (hwloc), 2014, 2014 International Conference on High Performance Computing & Simulation (HPCS).
[20] Chun Chen, et al. Loop Transformation Recipes for Code Generation and Auto-Tuning, 2009, LCPC.
[21] Daniel Sunderland, et al. Manycore performance-portability: Kokkos multidimensional array library, 2012.
[22] Chita R. Das, et al. Application-aware prioritization mechanisms for on-chip networks, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[23] Scott B. Baden, et al. Mint: realizing CUDA performance in 3D stencil methods with annotated C, 2011, ICS '11.
[24] Mateo Valero, et al. Breaking the bandwidth wall in chip multiprocessors, 2011, 2011 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation.
[25] Brian Rogers, et al. Scaling the bandwidth wall: challenges in and avenues for CMP scaling, 2009, ISCA '09.
[26] John Shalf, et al. BoxLib with Tiling: An AMR Software Framework, 2016, ArXiv.
[27] Samuel Williams, et al. ExaSAT: An exascale co-design tool for performance modeling, 2015, Int. J. High Perform. Comput. Appl.
[28] Haibo Chen, et al. Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling, 2013, TACO.
[29] Mauro Bianco, et al. A Generic Strategy for Multi-stage Stencils, 2014, Euro-Par.