Bandit-based optimization on graphs with application to library performance tuning

The problem of choosing fast implementations for a class of recursive algorithms such as the fast Fourier transforms can be formulated as an optimization problem over the language generated by a suitably defined grammar. We propose a novel algorithm that solves this problem by reducing it to maximizing an objective function over the sinks of a directed acyclic graph. This algorithm valuates nodes using Monte-Carlo and grows a subgraph in the most promising directions by considering local maximum k-armed bandits. When used inside an adaptive linear transform library, it cuts down the search time by an order of magnitude compared to the existing algorithm. In some cases, the performance of the implementations found is also increased by up to 10% which is of considerable practical importance since it consequently improves the performance of all applications using the library.

[1]  Manuela M. Veloso,et al.  Learning to Construct Fast Signal Processing Implementations , 2002, J. Mach. Learn. Res..

[2]  Franz Franchetti,et al.  A Rewriting System for the Vectorization of Signal Transforms , 2006, VECPAR.

[3]  Csaba Szepesvári,et al.  Bandit Based Monte-Carlo Planning , 2006, ECML.

[4]  R. C. Whaley,et al.  Minimizing development and maintenance costs in supporting persistently optimized BLAS , 2005, Softw. Pract. Exp..

[5]  Steven G. Johnson,et al.  The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.

[6]  T. L. Lai Andherbertrobbins Asymptotically Efficient Adaptive Allocation Rules , 1985 .

[7]  Markus Püschel,et al.  Computer Generation of General Size Linear Transform Libraries , 2009, 2009 International Symposium on Code Generation and Optimization.

[8]  Markus Püschel,et al.  In search of the optimal Walsh-Hadamard transform , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[9]  Dragan Mirkovic,et al.  An adaptive software library for fast Fourier transforms , 2000, ICS '00.

[10]  Sylvain Gelly,et al.  Modifications of UCT and sequence-like simulations for Monte-Carlo Go , 2007, 2007 IEEE Symposium on Computational Intelligence and Games.

[11]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[12]  Jos W. H. M. Uiterwijk,et al.  Monte-Carlo tree search in production management problems , 2006 .

[13]  Yevgen Voronenko,et al.  Library generation for linear transforms , 2008 .

[14]  Stephen F. Smith,et al.  A Simple Distribution-Free Approach to the Max k-Armed Bandit Problem , 2006, CP.

[15]  Franz Franchetti,et al.  SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.

[16]  M. Puschel,et al.  FFT Program Generation for Shared Memory: SMP and Multicore , 2006, ACM/IEEE SC 2006 Conference (SC'06).

[17]  Stephen F. Smith,et al.  The Max K-Armed Bandit: A New Model of Exploration Applied to Search Heuristic Selection , 2005, AAAI.

[18]  Rémi Coulom,et al.  Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search , 2006, Computers and Games.

[19]  Csaba Szepesvári,et al.  Tuning Bandit Algorithms in Stochastic Environments , 2007, ALT.

[20]  Katherine Yelick,et al.  OSKI: A library of automatically tuned sparse matrix kernels , 2005 .

[21]  David Silver,et al.  Combining online and offline knowledge in UCT , 2007, ICML '07.