Early Experiences for Adaptation of Auto-tuning by ppOpen-AT to an Explicit Method

We present a code optimization technique that applies an auto-tuning (AT) function to an explicit method using the static code generator FIBER. We evaluate the AT function on current multicore processors to reflect situations with high-thread parallelism (HTP). The performance evaluation indicates that the AT function is crucial for HTP: with the static code generator, the explicit method achieves speedups of up to 7.4× over the original implementations, which rely on compiler optimization only.
