Spartan: A Distributed Array Framework with Smart Tiling
Jinyang Li | Zhen Xiao | Zhaoguo Wang | Qi Chen | Jorge Ortiz | Chien-Chin Huang | Russell Power
[1] Yuan Yu,et al. Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.
[2] Franz Franchetti,et al. Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures , 2011, CC.
[3] Joseph E. Gonzalez,et al. GraphLab: A New Parallel Framework for Machine Learning , 2010 .
[4] Scott Shenker,et al. Shark: SQL and rich analytics at scale , 2012, SIGMOD '13.
[5] Robert J. Harrison,et al. Global arrays: A nonuniform memory access programming model for high-performance computers , 1996, The Journal of Supercomputing.
[6] R Core Team,et al. R: A language and environment for statistical computing , 2014 .
[7] Craig Chambers,et al. FlumeJava: easy, efficient data-parallel pipelines , 2010, PLDI '10.
[8] Ulrich Kremer,et al. NP-completeness of Dynamic Remapping , 1993 .
[9] Andrey Gubarev,et al. Dremel: Interactive Analysis of Web-Scale Datasets , 2011 .
[10] Jinyang Li,et al. Building fast, distributed programs with partitioned tables , 2010 .
[11] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[12] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[13] Sanjay Ghemawat,et al. MapReduce: simplified data processing on large clusters , 2008, CACM.
[14] Scott Shenker,et al. Spark: Cluster Computing with Working Sets , 2010, HotCloud.
[15] Robert A. van de Geijn,et al. Elemental: A New Framework for Distributed Memory Dense Matrix Computations , 2013, TOMS.
[16] P. Sadayappan,et al. Communication-Free Hyperplane Partitioning of Nested Loops , 1993, J. Parallel Distributed Comput..
[17] Scott A. Mahlke,et al. Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[18] Guy L. Steele,et al. Data Optimization: Allocation of Arrays to Reduce Communication on SIMD Machines , 1990, J. Parallel Distributed Comput..
[19] Peter Baumann,et al. The multidimensional database system RasDaMan , 1998, SIGMOD '98.
[20] Carlos Maltzahn,et al. SciHadoop: Array-based query processing in Hadoop , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[21] Uday Bondhugula,et al. Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[22] Lawrence Snyder,et al. ZPL: An Array Sublanguage , 1993, LCPC.
[23] Marina C. Chen,et al. The Data Alignment Phase in Compiling Programs for Distributed-Memory Machines , 1991, J. Parallel Distributed Comput..
[24] Jean-Philippe Martin,et al. Dandelion: a compiler and runtime for heterogeneous systems , 2013, SOSP.
[25] John Cavazos,et al. Trace-Driven Memory Access Pattern Recognition in Computational Kernels , 2014 .
[26] John Glauert,et al. SISAL: streams and iteration in a single assignment language. Language reference manual, Version 1.2. Revision 1 , 1985 .
[27] Steven Hand,et al. CIEL: A Universal Execution Engine for Distributed Data-Flow Computing , 2011, NSDI.
[28] Jingke Li,et al. Index domain alignment: minimizing cost of cross-referencing between distributed arrays , 1990, [1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation.
[29] Oscar R. Hernandez,et al. Open64-based Regular Stencil Shape Recognition in HERCULES , 2013 .
[30] Michael Philippsen,et al. Automatic alignment of array data and processes to reduce communication time on DMPPs , 1995, PPOPP '95.
[31] Yingwei Luo,et al. Optimal Cache Partition-Sharing , 2015, 2015 44th International Conference on Parallel Processing.
[32] Jinyang Li,et al. Piccolo: Building Fast, Distributed Programs with Partitioned Tables , 2010, OSDI.
[33] Michael Isard,et al. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language , 2008, OSDI.
[34] Jacobi. Pattern Driven Automatic Parallelization , 2004 .
[35] Michael Stonebraker,et al. SciDB: A Database Management System for Applications with Complex Analytics , 2013, Computing in Science & Engineering.
[36] Marwan A. Jabri,et al. Automatic array alignment in parallel Matlab scripts , 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999.
[37] Keshav Pingali,et al. Solving Alignment Using Elementary Linear Algebra , 1994, LCPC.
[38] Santosh G. Abraham,et al. Compiler techniques for data partitioning of sequentially iterated parallel loops , 1990, ICS '90.
[39] M. Abadi,et al. Naiad: a timely dataflow system , 2013, SOSP.
[40] Gustavo Alonso,et al. Pydron: Semi-Automatic Parallelization for Multi-Core and the Cloud , 2014, OSDI.
[41] Katherine Yelick,et al. UPC Language Specifications V1.1.1 , 2003 .
[42] J. Ramanujam,et al. A methodology for parallelizing programs for multicomputers and complex memory multiprocessors , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[43] Guy E. Blelloch,et al. NESL: A Nested Data-Parallel Language (Version 2.6) , 1993 .
[44] Tim Kraska,et al. MLI: An API for Distributed Machine Learning , 2013, 2013 IEEE 13th International Conference on Data Mining.
[45] Ken Kennedy,et al. Automatic data layout for distributed-memory machines , 1998, TOPL.
[46] Jack Dongarra,et al. LAPACK: a portable linear algebra library for high-performance computers , 1990, SC.
[47] J. Ramanujam,et al. Compile-Time Techniques for Data Distribution in Distributed Memory Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[48] Zhengping Qian,et al. MadLINQ: large-scale distributed matrix computation for the cloud , 2012, EuroSys '12.
[49] Robert R. Lewis,et al. Using the Global Arrays Toolkit to Reimplement NumPy for Distributed Computation , 2011 .
[50] John Glauert,et al. SISAL: streams and iteration in a single-assignment language. Language reference manual, Version 1.1 , 1983 .
[51] Henry G. Dietz,et al. Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation , 1991, LCPC.
[52] Paul M. Anderson. The Use and Limitations of Static-Analysis Tools to Improve Software Quality , 2008 .
[53] William Gropp,et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries , 1997, SciTools.
[54] Christopher Olston,et al. Interactive Analysis of Web-Scale Data , 2009, CIDR.
[55] Alan Edelman,et al. Parallel MATLAB: Doing it Right , 2005, Proceedings of the IEEE.
[56] Jean-Luc Gaudiot,et al. The Sisal model of functional programming and its implementation , 1997, Proceedings of IEEE International Symposium on Parallel Algorithms Architecture Synthesis.
[57] Robert W. Numrich,et al. Co-array Fortran for parallel programming , 1998, FORF.
[58] Guy E. Blelloch,et al. NESL: A Nested Data-Parallel Language , 1992 .
[59] Michael A. Frumkin,et al. Automatic Recognition of Performance Idioms in Scientific Applications , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[60] Murray Stokely,et al. Large-Scale Parallel Statistical Forecasting Computations in R , 2011 .
[61] Charles L. Lawson,et al. Basic Linear Algebra Subprograms for Fortran Usage , 1979, TOMS.
[62] Erik H. D'Hollander,et al. Partitioning and Labeling of Index Sets in DO Loops with Constant Dependence Vectors , 1989, ICPP.
[63] Jack Dongarra,et al. ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers , 1992, [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation.
[64] Alvin AuYoung,et al. Presto: distributed machine learning and graph processing with sparse matrices , 2013, EuroSys '13.