Servet: A benchmark suite for autotuning on multicore clusters
Juan Touriño | Basilio B. Fraguela | María J. Martín | Jorge González-Domínguez | Guillermo L. Taboada
[1] Wenguang Chen, et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, 2006, ICS '06.
[2] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[3] K. Yotov, et al. X-ray: a tool for automatic measurement of hardware parameters, 2005, Second International Conference on the Quantitative Evaluation of Systems (QEST'05).
[4] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[5] Dhabaleswar K. Panda, et al. Fast collective operations using shared and remote memory access protocols on clusters, 2003, Proceedings International Parallel and Distributed Processing Symposium.
[6] Roger W. Hockney, et al. The Communication Challenge for MPP: Intel Paragon and Meiko CS-2, 1994, Parallel Computing.
[7] Basilio B. Fraguela, et al. Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling, 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[8] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.
[9] Gang Ren, et al. Is Search Really Necessary to Generate High-Performance BLAS?, 2005, Proceedings of the IEEE.
[10] Alan Jay Smith, et al. Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes, 1995, IEEE Trans. Computers.
[11] Guillaume Mercier, et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments, 2009, PVM/MPI.
[12] Keshav Pingali, et al. Automatic measurement of memory hierarchy parameters, 2005, SIGMETRICS '05.
[13] Jack J. Dongarra, et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[14] Roger W. Hockney. The communication challenge for MPP, 1994.
[15] S. Sistare, et al. Optimization of MPI Collectives on Clusters of Large-Scale SMPs, 1999, ACM/IEEE SC 1999 Conference (SC'99).
[16] Franz Franchetti, et al. SPIRAL: Code Generation for DSP Transforms, 2005, Proceedings of the IEEE.
[17] D. Padua, et al. P-Ray: A Suite of Micro-benchmarks for Multi-core Architectures, 2008.
[18] Jesper Larsson Träff, et al. The Hierarchical Factor Algorithm for All-to-All Communication (Research Note), 2002, Euro-Par.
[19] Steve Sistare, et al. Optimization of MPI Collectives on Clusters of Large-Scale SMP's, 1999, SC.
[20] Yuefan Deng, et al. New trends in high performance computing, 2001, Parallel Computing.
[21] Juan Touriño, et al. Performance analysis of message-passing libraries on high-speed clusters, 2010, Comput. Syst. Sci. Eng.