Tuning iteration space slicing based tiled multi-core code implementing Nussinov’s RNA folding

Background

RNA folding is an ongoing compute-intensive task in bioinformatics. Parallelizing algorithms of this kind and improving their code locality is one of the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov’s recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov’s RNA folding. Such techniques fall within the iteration space slicing framework: the transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem in generating parallel tiled code is choosing a proper tile size and tile dimension, which affect the degree of parallelism and code locality.

Results

To choose the best tile size and tile dimension, we first construct parallel parametric tiled code (parameters are variables defining tile size). For this purpose, we generate two nonparametric tiled codes with different fixed tile sizes but with the same code structure and then derive a general affine model, which describes all integer factors appearing in the expressions of those codes. Using this model and the known integer factors present in those expressions (they define the left-hand side of the model), we find the unknown integers of the model for each integer factor occurring at the same position in the fixed tiled codes, and replace the expressions containing integer factors in this code with expressions containing parameters. Then we use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover, within a given search space, the tile size and tile dimension that maximize target code performance.

Conclusions

For a given search space, the presented approach allows us to choose the best tile size and tile dimension in parallel tiled code implementing Nussinov’s RNA folding. Experimental results, obtained on modern Intel multi-core processors, demonstrate that this code outperforms known closely related implementations when the length of RNA strands is greater than 2500.
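The parametrization step described in the Results can be illustrated with a small sketch. It is not the authors' implementation: it assumes the simplest possible model, a single tile-size parameter T and an integer factor of the form c = alpha*T + beta, so that two tiled codes with different fixed tile sizes give two equations for the two unknowns. The function names, the parameter name `T`, and the sample factors are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def fit_affine_factor(t1, c1, t2, c2):
    """Solve c = alpha*t + beta from two (tile size, integer factor) pairs.

    Assumed model (illustration only): one tile-size parameter and an
    affine dependence of each integer factor on it.
    """
    if t1 == t2:
        raise ValueError("the two fixed tile sizes must differ")
    alpha = Fraction(c2 - c1, t2 - t1)
    beta = c1 - alpha * t1
    return alpha, beta


def as_expression(alpha, beta, name="T"):
    """Render alpha*name + beta as a short symbolic expression."""
    if alpha == 0:
        return str(beta)
    term = name if alpha == 1 else f"{alpha}*{name}"
    if beta == 0:
        return term
    sign = "+" if beta > 0 else "-"
    return f"{term} {sign} {abs(beta)}"


def parametrize(t1, factors1, t2, factors2, name="T"):
    """Replace the integer factors found at the same code positions in two
    fixed-size tiled codes with expressions in the tile-size parameter."""
    if len(factors1) != len(factors2):
        raise ValueError("both codes must have the same structure")
    return [
        as_expression(*fit_affine_factor(t1, c1, t2, c2), name)
        for c1, c2 in zip(factors1, factors2)
    ]


if __name__ == "__main__":
    # Hypothetical factors read from the same positions of two tiled codes
    # generated with tile sizes 16 and 32.
    print(parametrize(16, [16, 15, 1], 32, [32, 31, 1]))
    # prints ['T', 'T - 1', '1']
```

A factor that does not change between the two codes yields alpha = 0 and stays a constant, while a factor that tracks the tile size is replaced by an expression in `T`. With several tile dimensions the model would need one unknown per parameter and correspondingly more information than this one-parameter sketch uses.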
