Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)
[1] David S. Wise,et al. Experiments with Quadtree Representation of Matrices , 1988, ISSAC.
[2] Rashid Mehmood,et al. SURAA: A Novel Method and Tool for Loadbalanced and Coalesced SpMV Computations on GPUs , 2019 .
[3] Marta Z. Kwiatkowska,et al. A Symbolic Out-of-Core Solution Method for Markov Models , 2002, Electron. Notes Theor. Comput. Sci..
[4] William Gropp,et al. Applications of the streamed storage format for sparse matrix operations , 2014, Int. J. High Perform. Comput. Appl..
[5] Kurt Keutzer,et al. clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs , 2012, ICS '12.
[6] Jaafar M. H. Elmirghani,et al. Performance Evaluation of a Metro WDM Multi-channel Ring Network with Variable-length Packets , 2007, 2007 IEEE International Conference on Communications.
[7] Rashid Mehmood,et al. Smarter Traffic Prediction Using Big Data, In-Memory Computing, Deep Learning and GPUs , 2019, Sensors.
[8] Beata Bylina,et al. A Markovian Model of a Network of Two Wireless Devices , 2012, CN.
[9] Eric S. Chung,et al. Towards a Universal FPGA Matrix-Vector Multiplication Architecture , 2012, 2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines.
[10] Michael Garland,et al. Merge-Based Parallel Sparse Matrix-Vector Multiplication , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] Frédéric Magoulès,et al. Alinea: An Advanced Linear Algebra Library for Massively Parallel Computations on Graphics Processing Units , 2015, Int. J. High Perform. Comput. Appl..
[12] Pavel Tvrdík,et al. Evaluation Criteria for Sparse Matrix Storage Formats , 2016, IEEE Transactions on Parallel and Distributed Systems.
[13] Rashid Mehmood,et al. Big data logistics: a health-care transport capacity sharing model , 2015 .
[14] Jason D. Bakos,et al. Exploiting Matrix Symmetry to Improve FPGA-Accelerated Conjugate Gradient , 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.
[15] Dejan Markovic,et al. A scalable sparse matrix-vector multiplication kernel for energy-efficient sparse-blas on FPGAs , 2014, FPGA.
[16] Ümit V. Çatalyürek,et al. Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi , 2013, PPAM.
[17] K. M. Azharul Hasan,et al. Efficient storage scheme for n-dimensional sparse array: GCRS/GCCS , 2015, 2015 International Conference on High Performance Computing & Simulation (HPCS).
[18] George A. Constantinides,et al. Optimizing memory bandwidth use and performance for matrix-vector multiplication in iterative methods , 2011, TRETS.
[19] P. Sadayappan,et al. Effective Machine Learning Based Format Selection and Performance Modeling for SpMV on GPUs , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[20] Srinivasan Parthasarathy,et al. Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[21] Rashid Mehmood,et al. Computational Markovian analysis of large systems , 2011 .
[22] Taisir E.H. El-Gorashi,et al. A Mirroring Strategy for SANs in a Metro WDM Sectioned Ring Architecture under Different Traffic Scenarios , 2008 .
[23] Georgi Kuzmanov,et al. Reconfigurable sparse/dense matrix-vector multiplier , 2009, 2009 International Conference on Field-Programmable Technology.
[24] Sherali Zeadally,et al. Multimedia applications over metropolitan area networks (MANs) , 2011, J. Netw. Comput. Appl..
[25] Srinivasan Parthasarathy,et al. Automatic Selection of Sparse Matrix Representation on GPUs , 2015, ICS.
[26] A. N. Yzelman. Generalised vectorisation for sparse matrix-vector multiplication , 2015, IA3@SC.
[27] Rashid Mehmood,et al. UbeHealth: A Personalized Ubiquitous Cloud and Edge-Enabled Networked Healthcare System for Smart Cities , 2018, IEEE Access.
[28] Davide Barbieri,et al. Sparse Matrix-Vector Multiplication on GPGPUs , 2017, ACM Trans. Math. Softw..
[29] Feng Shi,et al. Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU , 2016, 2016 45th International Conference on Parallel Processing (ICPP).
[30] Athanasios Fevgas,et al. Efficient solution of large sparse linear systems in modern hardware , 2015, 2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA).
[31] Fangfang Li,et al. Efficient sparse matrix-vector multiplication using cache oblivious extension quadtree storage format , 2016, Future Gener. Comput. Syst..
[32] Michele Martone,et al. Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format , 2014, Parallel Comput..
[33] Rashid Mehmood,et al. ZAKI: A Smart Method and Tool for Automatic Performance Optimization of Parallel SpMV Computations on Distributed Memory Machines , 2019, Mobile Networks and Applications.
[34] Rashid Mehmood,et al. Exploring the influence of big data on city transport operations: a Markovian approach , 2017 .
[35] Mario Di Francesco,et al. Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading , 2020, IEEE INFOCOM 2020 - IEEE Conference on Computer Communications.
[36] Brian Vinter,et al. CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication , 2015, ICS.
[37] Frédéric Magoulès,et al. Efficient implementation of Jacobi iterative method for large sparse linear systems on graphic processing units , 2017, The Journal of Supercomputing.
[38] Frédéric Magoulès,et al. Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis , 2012, 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems.
[39] Mitsuo Gen,et al. Accelerating genetic algorithms with GPU computing: A selective overview , 2019, Comput. Ind. Eng..
[40] Goran Flegar,et al. Overcoming Load Imbalance for Irregular Sparse Matrices , 2017, IA3@SC.
[41] André DeHon,et al. Floating-point sparse matrix-vector multiply for FPGAs , 2005, FPGA '05.
[42] Weixing Ji,et al. Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms , 2020, Int. J. High Perform. Comput. Appl..
[43] Rashid Mehmood,et al. Rapid Transit Systems: Smarter Urban Planning Using Big Data, In-Memory Computing, Deep Learning, and GPUs , 2019, Sustainability.
[44] Zhang Qian,et al. A new method of Sparse Matrix-Vector Multiplication on GPU , 2012, Proceedings of 2012 2nd International Conference on Computer Science and Network Technology.
[45] Jeffrey S. Vetter,et al. A Survey of CPU-GPU Heterogeneous Computing Techniques , 2015, ACM Comput. Surv..
[46] Wayne Luk,et al. Accelerating SpMV on FPGAs by Compressing Nonzero Values , 2015, 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines.
[47] David M. Lucantoni,et al. A Markov Modulated Characterization of Packetized Voice and Data Traffic and Related Statistical Multiplexer Performance , 1986, IEEE J. Sel. Areas Commun..
[48] Kenli Li,et al. Performance Analysis and Optimization for SpMV on GPU Using Probabilistic Modeling , 2015, IEEE Transactions on Parallel and Distributed Systems.
[49] Rashid Mehmood,et al. Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs , 2020 .
[50] Rashid Mehmood,et al. ZAKI+: A Machine Learning Based Process Mapping Tool for SpMV Computations on Distributed Memory Architectures , 2019, IEEE Access.
[51] Rashid Mehmood,et al. Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture , 2019, Smart Infrastructure and Applications.
[52] Wu-chun Feng,et al. Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[53] Amy Nicole Langville,et al. A Survey of Eigenvector Methods for Web Information Retrieval , 2005, SIAM Rev..
[54] Richard W. Vuduc,et al. Model-driven autotuning of sparse matrix-vector multiply on GPUs , 2010 .
[55] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[56] Wayne Luk,et al. Optimising Sparse Matrix Vector multiplication for large scale FEM problems on FPGA , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[57] Feng Shi,et al. BestSF , 2018, ACM Trans. Archit. Code Optim..
[58] Liqiang Wang,et al. Auto-Tuning CUDA Parameters for Sparse Matrix-Vector Multiplication on GPUs , 2010, 2010 International Conference on Computational and Information Sciences.
[59] Ping Guo,et al. A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs , 2014, IEEE Transactions on Parallel and Distributed Systems.
[60] Peter Luksch,et al. Analysis of Sparse Matrix-Vector Multiplication Using Iterative Method in CUDA , 2013, 2013 IEEE Eighth International Conference on Networking, Architecture and Storage.
[61] Walid A. Abu-Sufah,et al. An Effective Approach for Implementing Sparse Matrix-Vector Multiplication on Graphics Processing Units , 2012, 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems.
[62] Kenli Li,et al. A hybrid computing method of SpMV on CPU-GPU heterogeneous computing systems , 2017, J. Parallel Distributed Comput..
[63] J. Elmirghani,et al. A data Mirroring technique for SANs in a Metro WDM sectioned ring , 2008, 2008 International Conference on Optical Network Design and Modeling.
[64] Gerhard Wellein,et al. A Unified Sparse Matrix Data Format for Efficient General Sparse Matrix-Vector Multiplication on Modern Processors with Wide SIMD Units , 2013, SIAM J. Sci. Comput..
[65] Scott A. Mahlke,et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).