Improving Execution Concurrency of Large-Scale Matrix Multiplication on Distributed Data-Parallel Platforms
Rong Gu | Chen Tian | Yihua Huang | Yun Tang | Xudong Zheng | Hucheng Zhou | Guanru Li
[1] Jiannong Cao, et al. MatrixMap: Programming Abstraction and Implementation of Matrix Computation for Big Data Applications, 2015, 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS).
[2] Yu Cao, et al. HeteroSpark: A heterogeneous CPU/GPU Spark platform for machine learning algorithms, 2015, 2015 IEEE International Conference on Networking, Architecture and Storage (NAS).
[3] Michael Stonebraker, et al. SciDB: A Database Management System for Applications with Complex Analytics, 2013, Computing in Science & Engineering.
[4] Jin-Soo Kim, et al. HAMA: An Efficient Matrix Computation with the MapReduce Framework, 2010, 2010 IEEE Second International Conference on Cloud Computing Technology and Science.
[5] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[6] James Demmel, et al. Communication-optimal parallel algorithm for Strassen's matrix multiplication, 2012, SPAA '12.
[7] Bin Cui, et al. Exploiting Matrix Dependency for Efficient Distributed Matrix Computation, 2015, SIGMOD Conference.
[8] Fabian Hueske, et al. Apache Flink, 2019, Encyclopedia of Big Data Technologies.
[9] Reynold Xin, et al. GraphX: Graph Processing in a Distributed Dataflow Framework, 2014, OSDI.
[10] A. Davidson. Optimizing Shuffle Performance in Spark, 2013.
[11] Shirish Tatikonda, et al. SystemML: Declarative machine learning on MapReduce, 2011, 2011 IEEE 27th International Conference on Data Engineering.
[12] Kadir Akbudak, et al. Locality-Aware Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication on Many-Core Processors, 2016, IEEE Transactions on Parallel and Distributed Systems.
[13] Marcos Dias de Assunção, et al. Apache Spark, 2019, Encyclopedia of Big Data Technologies.
[14] John Canny, et al. BIDMach: Large-scale Learning with Zero Memory Allocation, 2013.
[15] Robert A. van de Geijn, et al. SUMMA: scalable universal matrix multiplication algorithm, 1995, Concurr. Pract. Exp.
[16] Rong Gu, et al. Efficient large scale distributed matrix computation with Spark, 2015, 2015 IEEE International Conference on Big Data (Big Data).
[17] Jinyang Li, et al. Spartan: A Distributed Array Framework with Smart Tiling, 2015, USENIX Annual Technical Conference.
[18] Jinyang Li, et al. Piccolo: Building Fast, Distributed Programs with Partitioned Tables, 2010, OSDI.
[19] Yuan Yu, et al. Dryad: distributed data-parallel programs from sequential building blocks, 2007, EuroSys '07.
[20] Alexander Tiskin, et al. Memory-Efficient Matrix Multiplication in the BSP Model, 1999, Algorithmica.
[21] Shirish Tatikonda, et al. Resource Elasticity for Large-Scale Machine Learning, 2015, SIGMOD Conference.
[22] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[23] Joseph M. Hellerstein, et al. GraphLab: A New Framework For Parallel Machine Learning, 2010, UAI.
[24] Zhengping Qian, et al. MadLINQ: large-scale distributed matrix computation for the cloud, 2012, EuroSys '12.
[25] James Demmel, et al. Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[26] R. C. Whaley, et al. Automatically Tuned Linear Algebra Software (ATLAS), 2011, Encyclopedia of Parallel Computing.
[27] Peter J. Haas, et al. Ricardo: integrating R and Hadoop, 2010, SIGMOD Conference.
[28] Joseph Gonzalez, et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs, 2012, OSDI.
[29] Yangqing Jia, et al. Learning Semantic Image Representations at a Large Scale, 2014.
[30] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.
[31] James Demmel, et al. Matrix Multiplication on Multidimensional Torus Networks, 2012, VECPAR.
[32] Jack Dongarra, et al. ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers, 1992, [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation.
[33] Alvin AuYoung, et al. Presto: distributed machine learning and graph processing with sparse matrices, 2013, EuroSys '13.
[34] Tim Kraska, et al. MLI: An API for Distributed Machine Learning, 2013, 2013 IEEE 13th International Conference on Data Mining.
[35] Matei Zaharia, et al. Matrix Computations and Optimization in Apache Spark, 2015, KDD.