Tiling and Scheduling of Three-level Perfectly Nested Loops with Dependencies on Heterogeneous Systems
Jaber Karimpour | Shahriar Lotfi | Leili Mohammad Khanli | Ebrahim Zarei Zefreh
[1] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[2] Hesham El-Rewini,et al. Advanced Computer Architecture and Parallel Processing , 2005 .
[3] Huazhong University of Science and Technology,et al. Journal of Huazhong University of Science and Technology , 2001 .
[4] Achim Basermann. Parallelizing iterative solvers for sparse systems of equations and eigenproblems on distributed-memory machines , 1994 .
[5] Jennifer Widom,et al. PARALLEL AND DISTRIBUTED SYSTEMS , 2010 .
[6] Soon Cheol Park. Efficient Data Structures and Algorithms for Scientific Computations. , 1991 .
[7] Nawwaf N. Kharma,et al. An Efficient Genetic Algorithm for Task Scheduling in Heterogeneous Distributed Computing Systems , 2006, 2006 IEEE International Conference on Evolutionary Computation.
[8] Chao-Tung Yang,et al. Implementation of a Performance-Based Loop Scheduling on Heterogeneous Clusters , 2009, ICA3PP.
[9] Anthony T. Chronopoulos,et al. Joint rate and power control with pricing , 2005, GLOBECOM '05. IEEE Global Telecommunications Conference, 2005..
[10] Tony Kai Yun Chan,et al. Task partitionings for parallel triangular solver on a MIMD computer , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.
[11] Anthony T. Chronopoulos,et al. Dynamic multi phase scheduling for heterogeneous clusters , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[12] Uday Bondhugula. Compiling affine loop nests for distributed-memory parallel architectures , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[13] Alexey L. Lastovetsky. Heterogeneity in parallel and distributed computing , 2013, J. Parallel Distributed Comput..
[14] Hui Liu,et al. HSIP: A Novel Task Scheduling Algorithm for Heterogeneous Computing , 2016, Sci. Program..
[15] Saeed Parsa,et al. Locality-Conscious Nested-Loops Parallelization , 2014 .
[16] Roland Glowinski,et al. Computational science for the 21st Century , 1997 .
[17] Yves Robert,et al. Matrix Multiplication on Heterogeneous Platforms , 2001, IEEE Trans. Parallel Distributed Syst..
[18] Sascha M. Schnepp,et al. Pipelined, Flexible Krylov Subspace Methods , 2015, SIAM J. Sci. Comput..
[19] Xiaorong Li,et al. A Sequential Cooperative Game Theoretic Approach to Storage-Aware Scheduling of Multiple Large-Scale Workflow Applications in Grids , 2012, 2012 ACM/IEEE 13th International Conference on Grid Computing.
[20] Safia Kedad-Sidhoum,et al. Scheduling independent tasks on multi‐cores with GPU accelerators , 2015, Concurr. Comput. Pract. Exp..
[21] Yong Wang,et al. A Task Allocation Schema Based on Response Time Optimization in Cloud Computing , 2014, ArXiv.
[22] Carsten F. Ball,et al. Smart Quality Enhancement in High Capacity Geran Networks , 2006, 2006 IEEE 17th International Symposium on Personal, Indoor and Mobile Radio Communications.
[23] Alexey L. Lastovetsky,et al. High Performance Heterogeneous Computing , 2009, Wiley series on parallel and distributed computing.
[24] Sébastien Le Digabel. NOMAD: Nonlinear Optimization with the MADS Algorithm , 2009 .
[25] Shaoyi Song,et al. Research on Load Balancing in Cloud Computing Based on Marketing Theory , 2013 .
[26] Anthony T. Chronopoulos,et al. Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems , 2012, Concurr. Comput. Pract. Exp..
[27] Maurice Clint,et al. The Computation of Partial Eigensolutions on a Distributed Memory Machine Using a Modified Lanczos Method , 1996, Euro-Par, Vol. II.
[28] Yves Robert,et al. Algorithmic Issues on Heterogeneous Computing Platforms , 1999, Parallel Process. Lett..
[29] Sebastián Reyes,et al. A Quadratic Self-Scheduling Algorithm for Heterogeneous Distributed Computing Systems , 2006, 2006 IEEE International Conference on Cluster Computing.
[30] Panayiotis Tsanakas,et al. Dynamic scheduling of nested loops with uniform dependencies in heterogeneous networks of workstations , 2005, 8th International Symposium on Parallel Architectures,Algorithms and Networks (ISPAN'05).
[31] Pen-Chung Yew,et al. Tile size selection revisited , 2013, ACM Trans. Archit. Code Optim..
[32] Mitsuo Gen,et al. Genetic algorithms and engineering optimization , 1999 .
[33] Shiping Chen,et al. Partitioning and scheduling loops on NOWs , 1999, Comput. Commun..
[34] Jingling Xue. Communication-Minimal Tiling of Uniform Dependence Loops , 1997, J. Parallel Distributed Comput..
[35] H. Martin Bücker,et al. Reducing global synchronization in the biconjugate gradient method , 1999 .
[36] Anthony T. Chronopoulos,et al. Enhancing self-scheduling algorithms via synchronization and weighting , 2008, J. Parallel Distributed Comput..
[37] Neeraj Pandey. Comparative Analysis of Job Scheduling for Grid Environment , 2013 .
[38] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[39] F. Castejón,et al. Simulations of fast ions distribution in stellarators based on coupled Monte Carlo fuelling and orbit codes , 2013 .
[40] Hong He,et al. Honeybee Mating Optimization Algorithm For Task Assignment In Heterogeneous Computing Systems , 2013, Intell. Autom. Soft Comput..
[41] K Shahu Chatrapati. Competitive equilibrium approach for load balancing a grid network , 2011 .
[42] Pamela L. Eddy. COLLEGE OF WILLIAM AND MARY , 2004 .
[43] Jaber Karimpour,et al. 3‐D data partitioning for 3‐level perfectly nested loops on heterogeneous distributed systems , 2017, Concurr. Comput. Pract. Exp..
[44] Gang Wei,et al. Game-theoretic rate allocation with balanced traffic in collaborative transmission over heterogeneous wireless access networks , 2012, IET Commun..
[45] A. Peirce. Computer Methods in Applied Mechanics and Engineering , 2010 .
[46] Xili Wang. A novel approach of solving the CNF-SAT problem , 2013, ArXiv.
[47] Gerhard Wellein,et al. Introduction to High Performance Computing for Scientists and Engineers , 2010, Chapman and Hall / CRC computational science series.
[48] Yves Raynaud,et al. Integrated Network Management IV , 1995, IFIP — The International Federation for Information Processing.
[49] Yves Robert,et al. Static tiling for heterogeneous computing platforms , 1999, Parallel Comput..
[50] Jeffrey S. Vetter,et al. Examining recent many-core architectures and programming models using SHOC , 2015, PMBS '15.
[51] Cristina L. Abad,et al. DARE: Adaptive Data Replication for Efficient Cluster Scheduling , 2011, 2011 IEEE International Conference on Cluster Computing.
[52] Wim Vanroose,et al. Improving the arithmetic intensity of multigrid with the help of polynomial smoothers , 2012, Numer. Linear Algebra Appl..
[53] Theodore Andronikos,et al. Distributed dynamic load balancing for pipelined computations on heterogeneous systems , 2011, Parallel Comput..
[54] David Padua,et al. Encyclopedia of Parallel Computing , 2011 .
[55] Minyi Guo,et al. Optimally Maximizing Iteration-Level Loop Parallelism , 2012, IEEE Transactions on Parallel and Distributed Systems.
[56] Saeed Parsa,et al. A New Genetic Algorithm for Loop Tiling , 2006, The Journal of Supercomputing.
[57] Markus Kowarschik,et al. An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms , 2002, Algorithms for Memory Hierarchies.
[58] Yiming Yang,et al. A Secure File Allocation Algorithm for Heterogeneous Distributed Systems , 2011, 2011 40th International Conference on Parallel Processing Workshops.
[59] Simon Miles,et al. Cluster Computing and Grid (CCGrid) , 2005 .
[60] Marcel Bauer,et al. Numerical Methods for Partial Differential Equations , 1994 .
[61] Sivasankaran Rajamanickam,et al. Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[62] Luis Pastor,et al. Parallel CBIR implementations with load balancing algorithms , 2006, J. Parallel Distributed Comput..
[63] Anisaara Nadaph,et al. Methodical Analysis of Various Balancer Conditions on Public Cloud Division , 2015, 2015 International Conference on Computing Communication Control and Automation.
[64] Daniel Grosu,et al. Incentive-centered design for scheduling in parallel and distributed systems , 2009 .
[65] Geoffrey C. Fox,et al. Distributed and Cloud Computing: From Parallel Processing to the Internet of Things , 2011 .
[66] Unsymmetric Linear,et al. A BLOCK VARIANT OF THE GMRES METHOD FOR , 1996 .
[67] Tinku Mohamed Rasheed,et al. Power control game for spectrum sharing in public safety communications , 2013, 2013 IEEE 18th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD).
[68] Xing Zhou,et al. Optimal Parallelogram Selection for Hierarchical Tiling , 2015, ACM Trans. Archit. Code Optim..
[69] P. Sadayappan,et al. Nested Loop Tiling for Distributed Memory Machines , 1990, Proceedings of the Fifth Distributed Memory Computing Conference, 1990..
[70] Michael J. Quinn,et al. Three-dimensional grid partitioning for network parallel processing , 1994, CSC '94.
[71] Ioana Banicescu,et al. A Load Balancing Tool for Distributed Parallel Loops , 2003, Proceedings of the International Workshop on Challenges of Large Applications in Distributed Environments, 2003..
[72] H. A. van der Vorst,et al. PARALLEL LINEAR SYSTEMS SOLVERS: SPARSE ITERATIVE METHODS , 1996 .
[73] Alexey L. Lastovetsky,et al. Data Partitioning with a Functional Performance Model of Heterogeneous Processors , 2007, Int. J. High Perform. Comput. Appl..
[74] Sudarshan S. Deshmukh,et al. Improved Queuing Mechanism for Hybrid Load Balancing Scheme in Interactive Application , 2013 .
[75] T. Manteuffel,et al. Adaptive polynomial preconditioning for hermitian indefinite linear systems , 1989 .
[76] Li Cheng,et al. A Novel Load Balancing Optimization Algorithm Based on Peer-to-Peer Technology in Streaming Media , 2012 .
[77] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[78] James Demmel,et al. Avoiding Communication in Nonsymmetric Lanczos-Based Krylov Subspace Methods , 2013, SIAM J. Sci. Comput..
[79] Anthony T. Chronopoulos,et al. Optimal synchronization frequency for dynamic pipelined computations on heterogeneous systems , 2007, 2007 IEEE International Conference on Cluster Computing.
[80] Anthony T. Chronopoulos,et al. Studying the impact of synchronization frequency on scheduling tasks with dependencies in heterogeneous systems , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[81] Sajal K. Das,et al. A Case Study-based Performance Evaluation Framework for CSCF Processes on a Blade-Server , 2007, International Conference on Networking and Services (ICNS '07).
[82] Sevin Fide,et al. A middleware approach for pipelining communications in clusters , 2007, Cluster Computing.
[83] Yves Robert,et al. Determining the idle time of a tiling: new results , 1997, Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques.
[84] James R. Cloutier,et al. Periodically preconditioned conjugate gradient-restoration algorithm for optimal control - The direct approach , 1996 .