Memory aware load balance strategy on a parallel branch‐and‐bound application
Lúcia Maria de A. Drummond | Cristina Boeres | Artur Alves Pessoa | Juliana M. N. Silva
[1] Jack J. Dongarra,et al. Analytical modeling and optimization for affinity based thread scheduling on multicore systems , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[2] Alfred V. Aho,et al. Proceedings of the nineteenth annual ACM symposium on Theory of computing , 1987, STOC 1987.
[3] Inmaculada García,et al. Adaptive parallel interval branch and bound algorithms based on their performance for multicore architectures , 2011, The Journal of Supercomputing.
[4] Jack Dongarra,et al. Analytical Modeling for Affinity-Based Thread Scheduling on Multicore Platforms , 2008 .
[5] Ben H. H. Juurlink,et al. The Parallel Hierarchical Memory Model , 1994, SWAT.
[6] Mohammad Zubair,et al. A unified model for multicore architectures , 2008, IFMT '08.
[7] Mihai Budiu,et al. DryadOpt: Branch-and-Bound on Distributed Data-Parallel Execution Engines , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[8] Viktor K. Prasanna,et al. Hierarchical Scheduling of DAG Structured Computations on Manycore Processors with Dynamic Thread Grouping , 2010, JSSPP.
[9] Bowen Alpern,et al. The uniform memory hierarchy model of computation , 2005, Algorithmica.
[10] Mihalis Yannakakis,et al. Towards an architecture-independent analysis of parallel algorithms , 1990, STOC '88.
[11] Jonghyun Park,et al. Parallel Skyline Computation on Multicore Architectures , 2009, ICDE.
[12] Bilel Derbel,et al. Overlay-Centric Load Balancing: Applications to UTS and B&B , 2012, 2012 IEEE International Conference on Cluster Computing.
[13] Richard Cole,et al. The APRAM: incorporating asynchrony into the PRAM model , 1989, SPAA '89.
[14] Cho-Li Wang,et al. Realistic communication model for parallel computing on cluster , 1999, ICWC 99. IEEE Computer Society International Workshop on Cluster Computing.
[15] Yossi Matias,et al. The QRQW PRAM: accounting for contention in parallel algorithms , 1994, SODA '94.
[16] Yuji Shinano,et al. A generalized utility for parallel branch and bound algorithms , 1995, Proceedings.Seventh IEEE Symposium on Parallel and Distributed Processing.
[17] Marco A. Boschetti,et al. A dual ascent procedure for the set partitioning problem , 2008, Discret. Optim..
[18] Craig A. Knoblock,et al. Advanced Programming in the UNIX Environment , 1992, Addison-Wesley professional computing series.
[19] Bertrand Le Cun,et al. A Parallel Exact Solver for the Three-Index Quadratic Assignment Problem , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[20] Alok Aggarwal,et al. Hierarchical memory with block transfer , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[21] Fumihiko Ino,et al. LogGPS: a parallel computational model for synchronization analysis , 2001, PPoPP '01.
[22] Dhabaleswar K. Panda,et al. Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).
[23] Mohammad Zubair,et al. Evaluating multicore algorithms on the unified memory model , 2009, Sci. Program..
[24] Bowen Alpern,et al. Modeling parallel computers as memory hierarchies , 1993, Proceedings of Workshop on Programming Models for Massively Parallel Computers.
[25] Lúcia Maria de A. Drummond,et al. A grid-enabled distributed branch-and-bound algorithm with application on the Steiner Problem in graphs , 2006, Parallel Comput..
[26] Cynthia A. Phillips,et al. PICO: An Object-Oriented Framework for Branch and Bound , 2000 .
[27] Ramesh Subramonian,et al. LogP: a practical model of parallel computation , 1996, CACM.
[28] Jack J. Dongarra,et al. Accurate Cache and TLB Characterization Using Hardware Counters , 2004, International Conference on Computational Science.
[29] Phillip B. Gibbons. A more practical PRAM model , 1989, SPAA '89.
[30] Michael Dahlin,et al. Emulations between QSM, BSP, and LogP: a framework for general-purpose parallel algorithm design , 1999, SODA '99.
[31] Cynthia A. Phillips,et al. Pico: An Object-Oriented Framework for Parallel Branch and Bound , 2001 .
[32] Mohammad Zubair,et al. Evaluating multicore algorithms on the unified memory model , 2009 .
[33] Michael A. Bauer,et al. Parallel Branch and Bound Algorithm - A comparison between serial, OpenMP and MPI implementations , 2010 .
[34] Tiffani L. Williams,et al. The Heterogeneous Bulk Synchronous Parallel Model , 2000, IPDPS Workshops.
[35] Apan Qasem,et al. An Evaluation of Parallel Knapsack Algorithms on Multicore Architectures , 2010, CSC.
[36] Cristina Boeres,et al. On the Feasibility of Dynamically Scheduling DAG Applications on Shared Heterogeneous Systems , 2009, Euro-Par.
[37] Guillaume Mercier,et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments , 2009, PVM/MPI.
[38] Hisham El-Shishiny,et al. Proceedings of the 1st international forum on Next-generation multicore/manycore technologies , 2008 .
[39] Stephen A. Rago,et al. Advanced Programming in the UNIX(R) Environment (2nd Edition) , 2005 .
[40] Yossi Matias,et al. Can shared-memory model serve as a bridging model for parallel computation? , 1997, SPAA '97.
[41] Leslie G. Valiant,et al. A bridging model for multi-core computing , 2008, J. Comput. Syst. Sci..
[42] El-Ghazali Talbi,et al. Hierarchical branch and bound algorithm for computational grids , 2012, Future Gener. Comput. Syst..
[43] Jack J. Dongarra,et al. Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[44] Dorit S. Hochbaum,et al. Approximation Algorithms for NP-Hard Problems , 1997, SIGA.
[45] Alan Jay Smith,et al. Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes , 1995, IEEE Trans. Computers.
[46] Vijaya Ramachandran,et al. QSM: A General Purpose Shared-Memory Model for Parallel Computation , 1997, FSTTCS.
[47] Paul G. Spirakis,et al. BSP vs LogP , 1996, SPAA '96.
[48] El-Ghazali Talbi,et al. An adaptive hierarchical master-worker (AHMW) framework for grids - Application to B&B algorithms , 2012, J. Parallel Distributed Comput..
[49] Joseph JáJá,et al. An Introduction to Parallel Algorithms , 1992 .
[50] Catherine Roucairol,et al. Bob++: Framework for Solving Optimization Problems with Branch-and-Bound methods , 2006, 2006 15th IEEE International Conference on High Performance Distributed Computing.
[51] Mary K. Vernon,et al. LoPC: modeling contention in parallel algorithms , 1997, PPOPP '97.
[52] Steven Fortune,et al. Parallelism in random access machines , 1978, STOC.
[53] Dorit S. Hochbaum,et al. Approximation Algorithms for NP-Hard Problems , 1996 .
[54] Lingjia Tang,et al. Contentiousness vs. sensitivity: improving contention aware runtime systems on multicore architectures , 2011, EXADAPT '11.
[55] Sadaf R. Alam,et al. Characterization of Scientific Workloads on Systems with Multi-Core Processors , 2006, 2006 IEEE International Symposium on Workload Characterization.
[56] Kwan-Liu Ma,et al. Parallel volume ray-casting for unstructured-grid data on distributed-memory architectures , 1995, PRS.
[57] Juan Touriño,et al. Servet: A benchmark suite for autotuning on multicore clusters , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[58] Rong Ge,et al. $\log_{\rm n}{\rm P}$ and $\log_{3}{\rm P}$: Accurate Analytical Models of Point-to-Point Communication in Distributed Systems , 2007, IEEE Transactions on Computers.
[59] Lingjia Tang,et al. Directly characterizing cross core interference through contention synthesis , 2011, HiPEAC.
[60] Chris J. Scheiman,et al. LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation , 1995, SPAA '95.
[61] Bowen Alpern,et al. A model for hierarchical memory , 1987, STOC.
[62] Bruce M. Maggs,et al. Models of Parallel Computation: A Survey and Synthesis , 1995, Proceedings of the 28th Annual Hawaii International Conference on System Sciences.
[63] K. Cameron,et al. lognP and log3P: Accurate Analytical Models of Point-to- point Communication in Distributed Systems , 2006 .
[64] Yuji Shinano. A Generalized Utility for Parallel Branch-and-Bound Algorithms (Generalization of a Parallel Branch-and-Bound System) , 1997 .
[65] Xiaofang Zhao,et al. Accurate Analytical Models for Message Passing on Multi-core Clusters , 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[66] Eduard Ayguadé,et al. Impact of the Memory Hierarchy on Shared Memory Architectures in Multicore Programming Models , 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[67] S. Sitharama Iyengar,et al. Introduction to parallel algorithms , 1998, Wiley series on parallel and distributed computing.
[68] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[69] Mihalis Yannakakis,et al. Towards an Architecture-Independent Analysis of Parallel Algorithms , 1990, SIAM J. Comput..