Polynomial-time Construction of Optimal Tree-structured Communication Data Layout Descriptions

We show that the problem of constructing tree-structured descriptions of data layouts from given sequences of displacements, such that the descriptions are optimal with respect to space or other criteria, can be solved in polynomial time. The problem is relevant for efficient compiler and library support for communication of noncontiguous data, where tree-structured descriptions with low-degree nodes and small index arrays are beneficial for the communication software and hardware. An important example is the Message-Passing Interface (MPI), which has a mechanism for describing arbitrary data layouts as trees using a set of increasingly general constructors. Our algorithm shows that the so-called MPI datatype reconstruction problem by trees with the full set of MPI constructors can be solved optimally in polynomial time, refuting previous conjectures that the problem is NP-hard. It can also handle further natural constructors currently not found in MPI. The algorithm is based on dynamic programming and requires the solution of a series of shortest path problems on an incrementally built directed acyclic graph. It runs in $O(n^4)$ time and requires $O(n^2)$ space for input displacement sequences of length $n$.
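
To give a flavor of the dynamic-programming and shortest-path idea, the following minimal sketch solves a much simpler, flat problem: partitioning a displacement sequence into the fewest consecutive strided runs. It is not the tree-constructing algorithm described above. The function name `min_strided_segments`, the unit cost per run, and the `(first displacement, count, stride)` output format are assumptions made purely for illustration. Prefix positions 0..n act as the nodes of a directed acyclic graph, and an edge i -> j exists whenever the displacements from i up to j-1 form an arithmetic progression.

```python
def min_strided_segments(disp):
    """Partition disp into the fewest consecutive constant-stride runs.

    Illustrative flat simplification only (hypothetical cost model:
    every run costs 1). Returns a list of (first, count, stride) triples.
    """
    n = len(disp)
    inf = float("inf")
    best = [0] + [inf] * n   # best[j]: fewest runs covering disp[:j]
    prev = [-1] * (n + 1)    # prev[j]: start index of the last run

    # Nodes 0..n are visited in topological order, so best[i] is final
    # when its outgoing edges are relaxed (shortest path on a DAG).
    for i in range(n):
        j = i + 1
        while j <= n:
            # A run of length >= 3 must keep the stride of its first pair.
            if j - i >= 3 and disp[j - 1] - disp[j - 2] != disp[i + 1] - disp[i]:
                break
            if best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                prev[j] = i
            j += 1

    # Walk the predecessor links back from n to recover the runs.
    runs = []
    j = n
    while j > 0:
        i = prev[j]
        count = j - i
        stride = disp[i + 1] - disp[i] if count > 1 else 0
        runs.append((disp[i], count, stride))
        j = i
    runs.reverse()
    return runs


if __name__ == "__main__":
    print(min_strided_segments([0, 4, 8, 12, 100, 101, 102]))
    # → [(0, 4, 4), (100, 3, 1)]
```

In this toy setting each run loosely resembles a vector-like constructor with a count and a stride. The actual problem additionally allows nesting of constructors into trees and uses richer cost criteria, which is where the repeated shortest-path computations and the higher complexity bounds stated above come in.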
