Perfect diffusion primitives for block ciphers - building efficient MDS matrices

Although linear perfect diffusion primitives, i.e. MDS matrices, are widely used in block ciphers, e.g. AES, very little systematic work has been done on how to find ``efficient'' ones. In this paper we attempt to do so by considering software implementations on various platforms. These considerations lead to interesting combinatorial problems: how to maximize the number of occurrences of 1 in those matrices, and how to minimize the number of pairwise different entries. We investigate these problems and construct efficient $4\times4$ and $8\times8$ MDS matrices to be used e.g. in block ciphers.