Cryptographic hash functions based on ALife

Mark A. Bedau, Richard Crandall, and Michael J. Raven, "Cryptographic hash functions based on ALife," PSIpress, 15 nov 09: A, http://www.perfscipress.com/papers/crypto9_psipress.pdf

There is a long history of cryptographic hash functions, i.e. functions that map variable-length strings to fixed-length strings and are also expected to enjoy certain security properties. Hash functions can be built from modular arithmetic, permutation-based schemes, chaotic mixing, and so on. Herein we introduce the notion of an artificial-life (ALife) hash function (ALHF), whereby the requisite mixing action of a good hash function is accomplished via ALife rules that give rise to complex evolution of a given system. Various security tests have been run, and the results are reported for examples of ALHFs.

1 Brief history of hash function design

A hash function $H$ maps arbitrary messages (bitstrings) called keys or pre-images into fixed-length bitstrings called hash values (the definitive treatment of hash functions is [Knuth 1998]). By the nomenclature $H(\kappa) = \nu$ we mean that $\kappa$ has some number of bits, say $m$ bits, and $\nu$ has $n$ bits. Usually $m > n$, so that hash values are “compressions” of the corresponding keys. We, however, do not assume that $m > n$. The notion of arbitrary message bitlength for $\kappa$ is, if one so desires, easily reduced to the more convenient notion of fixed $m, n$, via the simple observation that a long message may be split up into blocks of $m$ bits each, with one block possibly zero-padded, and so on. References for descriptions of hash function characteristics, thorough hash function nomenclature, and analysis are [Merkle 1979], [Preneel 1993], and [Preneel 1994].

Various modern hash functions in actual use have somewhat arbitrary foundations, and rigorous security analysis is almost always nontrivial. In cryptography, a hash function might appear to be cleverly constructed to “mix up” the $m$ bits to render a smaller number $n$ of bits, and yet there are often various security weaknesses of which a good hash function should be devoid. For example, the hash function defined by

$H(\kappa) = \left( \prod_j p_j(\kappa) \right) \bmod 65536$,

where $p_j(x)$ is the position of the $j$-th “1” in $x$, is always a 16-bit value because of the mod, yet this $H$ is terribly insecure. Imagine a password file having one 16-bit hash value for each of some 60000 users (each user’s typed password is some $\kappa$, and gets mapped to its hash value). Many of the hash values then “collide”. In fact, a long enough typed password is likely to have enough of its 1’s at even positions to render the hash value 0, because sixteen factors of 2 in the product already make it vanish modulo $65536 = 2^{16}$.

For an informal overview of the main kinds of modern hash functions, see [Schneier 1996]; a more rigorous overview can be found in [Menezes et al. 1997]. We authors are of the belief that statistical tests, such as the avalanche test (see [Feistel 1973] and [Menezes et al. 1997]) and the collision test (see [Menezes et al. 1997], [Schneier 1996], and [Stinson 1995]), are the best approach for assessing hash function security. In spite of this, we do appreciate the conceptual and aesthetic approaches to hash functions. In the present treatment, we attempt to spring from the intuitive understanding that evolving systems can be rife with complexity, at many levels, and use such intuition to create hash functions. To this end, we adapt evolving computational systems as studied in Artificial Life to create an artificial-life hash function (ALHF), whereby the requisite mixing action of a good hash function is accomplished via the sort of rules that give rise to complex evolution in standard Artificial Life systems.

2 Cryptographic security tests for hash functions

There are several typical ways to test the security of cryptographic hash functions. We tested the security of the ALHF against the so-called Avalanche Test and Collision Test.

Avalanche Test. Consider a hash function $H$ operating on $m$-bit input keys $\kappa_1, \kappa_2, \ldots$ and producing $n$-bit hash values $\nu_1, \nu_2, \ldots$.
Let $H(\kappa_1) = \nu_1$ and $H(\kappa_2) = \nu_2$, where $\kappa_1 \neq \kappa_2$ and $\nu_1 \neq \nu_2$. One way in which $H$ can be insecure is if small changes in $\kappa_1$ and $\kappa_2$ result in predictable changes in $\nu_1$ and $\nu_2$. First, let $|\nu|$ be the bit-length of hash value $\nu$. Let

$\delta_{\mathrm{input}}(\kappa_1, \kappa_2) = \frac{\kappa_1 \oplus \kappa_2}{|H(\kappa_1)|}$

and let

$\delta_{\mathrm{output}}(\nu_1, \nu_2) = \frac{\nu_1 \oplus \nu_2}{|H(\nu_1)|}$,

where each numerator is read as the number of 1 bits in the XOR, i.e. the number of bit positions in which the two strings differ. (We assume that $|H(\kappa_1)| = |H(\kappa_2)|$ and that $|H(\nu_1)| = |H(\nu_2)|$.) One could predict $\nu_1$ and $\nu_2$ from $\kappa_1$ and $\kappa_2$ if both $\delta_{\mathrm{input}}(\kappa_1, \kappa_2)$ and $\delta_{\mathrm{output}}(\nu_1, \nu_2)$ are small. The hardest case to avoid is where $\delta_{\mathrm{input}}(\kappa_1, \kappa_2) = \frac{1}{m}$, that is, where the two keys differ in a single bit. In this case, one wants $\delta_{\mathrm{output}}(\nu_1, \nu_2) = \frac{1}{2}$ (this roughly amounts to satisfying the strict avalanche criterion introduced and discussed in [Webster 1985] and [Webster and Tavares 1986], according to which, if a single bit in the key is complemented, then each bit of the hash value is complemented with probability one half). If this can be achieved, the cryptographic hash function achieves avalanche and the Avalanche Test is passed. See [Feistel 1973] for an early description of avalanches in cryptography. For recent developments see [Seberry et al. 1994], [Seberry et al. 1995], and [Zhang and Zheng 1995].

Collision Test. A collision is said to occur when two distinct input keys $\kappa_1$ and $\kappa_2$ are such that $H(\kappa_1) = H(\kappa_2)$. Any secure cryptographic hash function should rarely produce collisions. One good way of testing how well a given cryptographic hash function performs with respect to collisions is by means of the so-called Birthday Attack, which takes its name from the birthday paradox. At least 253 people must be in the same room as you for the probability that someone in the room shares your birthday to exceed one half. What is prima facie puzzling is that only 23 people need be in the same room for the probability that some two of them share a birthday to exceed one half.
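The two birthday figures quoted above can be checked numerically. The following is a minimal sketch, assuming 365 equally likely and independent birthdays and reading the threshold as a probability above one half; the function names are illustrative and do not come from the paper.

```python
def p_shares_your_birthday(others: int, days: int = 365) -> float:
    """Probability that at least one of `others` people has one fixed birthday."""
    return 1.0 - ((days - 1) / days) ** others


def p_any_shared_birthday(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share some birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def smallest_group(prob, threshold: float = 0.5) -> int:
    """Smallest group size k for which prob(k) exceeds `threshold`."""
    k = 1
    while prob(k) <= threshold:
        k += 1
    return k


if __name__ == "__main__":
    print(smallest_group(p_shares_your_birthday))  # 253
    print(smallest_group(p_any_shared_birthday))   # 23
```

The gap between the two answers is the point of the analogy: matching one fixed value takes a number of trials proportional to the number of possible values, whereas finding any matching pair takes a number of trials closer to its square root.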
The cryptographic application of the Birthday Attack is clear: it is much easier to find two random $m$-bit input keys $\kappa_1$ and $\kappa_2$ such that $H(\kappa_1) = H(\kappa_2)$ (where $H(x)$ is $n$ bits long) than it is to find, for a given $\kappa_1$, a $\kappa_2$ such that $H(\kappa_1) = H(\kappa_2)$. Given a $\kappa_1$, one must compute hash values for on the order of $2^n$ random $\kappa_2$’s to get a match. However, if one only wishes to find any two matching keys, then computing hash values for only on the order of $2^{n/2}$ random keys is necessary. See [Schneier 1996] and [Stinson 1995] for more on the Birthday Attack; [van Oorschot and Wiener 1994] and [Yuval 1979] discuss how to carry out a Birthday Attack. A secure cryptographic hash function should perform accordingly. The Birthday Attack Test is passed if the cryptographic hash function does not yield any matching hash values for fewer than $2^{n/2}$ random input keys and thereafter yields them as statistically expected. This is generally achieved if the bit length of the hash value is sufficiently long; see [Beth et al. 1992] for estimates on appropriate lengths for various tasks.

3 Artificial life: background

Artificial life (also known as “ALife”) is an interdisciplinary study of life and lifelike processes that uses a synthetic methodology. Artificial life has three broad branches, corresponding to three different synthetic methods. “Soft” artificial life creates simulations or other computational systems that exhibit life-like behavior, “hard” artificial life produces hardware implementations of life-like systems, and “wet” artificial life synthesizes living systems in biochemical media. The general goals of artificial life include understanding and creating life and life-like systems, and developing practical devices inspired by living systems. The main open questions in artificial life involve determining how life arises from non-life, determining the potentials and limits of living systems, and determining how life is connected to mind, machines, and culture [Bedau et al. 2000].
The American computer scientist Christopher Langton coined the phrase “artificial life” in 1987, when he organized the first scientific conference explicitly devoted to this field [Langton 1989]. Before there were artificial life conferences, the simulation and synthesis of life-like systems occurred in isolated pockets scattered across a variety of disciplines. The Hungarian-born mathematician and physicist John von Neumann created the first artificial life model (without referring to it as such) in the 1940s when he produced a self-reproducing, computation-universal entity using cellular automata [von Neumann 1966]. Rather than modeling some existing living system, many artificial life systems are intended to generate wholly new (and typically extremely simple) instances of life-like phenomena. The simplest example of such a system is the so-called “Game of Life” devised by the British mathematician John Conway [Berlekamp et al. 1982] in the 1960s, before the field of artificial life was conceived.

Perhaps the most famous recent artificial life system is Tierra, designed by the American biologist Tom Ray [Ray 1992]. Tierra consists of a population of self-replicating computer programs populating computer memory and consuming CPU time. The system is initialized when a single (human-designed) self-replicating program, the ancestor, is placed in computer memory and left alone to self-replicate. The ancestor and its descendants repeatedly replicate until memory is teeming with self-replicating programs. Errors (mutations) sometimes occur, so the population of Tierra programs evolves by natural selection. If a mutation allows a program to replicate faster, that type of program tends to spread through the population. Over time, the ecology of Tierran programs becomes remarkably diverse. Quickly reproducing parasites that exploit a host’s genetic code evolve, and this spurs the evolution of new programs that resist the parasites.
After millions of CPU cycles, Tierra typically contains many kinds of programs exhibiting a variety of competitive and cooperative ecological relationships.

Artificial life is similar to artificial intelligence (AI), both because the two fields study natural phenomena through computational models and because natural intelligent systems and natural living systems tend to coincide.

[1]  Bruce Schneier,et al.  Applied cryptography : protocols, algorithms, and source code in C , 1996 .

[2]  H. Feistel Cryptography and Computer Privacy , 1973 .

[3]  Yuliang Zheng,et al.  GAC - the Criterion for Global Avalanche Characteristics of Cryptographic Functions , 1995, J. Univers. Comput. Sci..

[4]  Stafford E. Tavares,et al.  On the Design of S-Boxes , 1985, CRYPTO.

[5]  Alfred Menezes,et al.  Handbook of Applied Cryptography , 2018 .

[6]  Praveen Gauravaram,et al.  Cryptographic Hash Functions , 2010, Encyclopedia of Information Assurance.

[7]  Kaoru Kurosawa,et al.  Towards Secure and Fast Hash Functions , 1998 .

[8]  Christopher G. Langton Artificial life : the proceedings of an Interdisciplinary Workshop on the Synthesis and Simulation of Living Systems held September, 1987, in Los Alamos, New Mexico , 1989 .

[9]  Ralph C. Merkle,et al.  One Way Hash Functions and DES , 1989, CRYPTO.

[10]  C. Pomerance,et al.  Prime Numbers: A Computational Perspective , 2002 .

[11]  Gideon Yuval,et al.  How to Swindle Rabin , 1979, Cryptologia.

[12]  Hugo Krawczyk,et al.  Keying Hash Functions for Message Authentication , 1996, CRYPTO.

[13]  John S. McCaskill,et al.  Open Problems in Artificial Life , 2000, Artificial Life.

[14]  Jennifer Seberry,et al.  Structures of Cryptographic Functions with Strong Avalanche Characteristics (Extended Abstract) , 1994, ASIACRYPT.

[15]  David Thomas,et al.  The Art in Computer Programming , 2001 .

[16]  Thomas Beth,et al.  Public-Key Cryptography: State of the Art and Future Directions , 1992, Lecture Notes in Computer Science.

[17]  Bart Preneel,et al.  Attacks on Fast Double Block Length Hash Functions , 1998, Journal of Cryptology.

[18]  Pieter Retief Kasselman,et al.  Analysis and design of cryptographic hash functions , 1999 .

[19]  Douglas R. Stinson,et al.  Cryptography: Theory and Practice , 1995 .

[20]  A. Blokhuis Winning ways for your mathematical plays , 1984 .

[21]  Jennifer Seberry,et al.  Improving the Strict Avalanche Characteristics of Cryptographic Functions , 1994, Inf. Process. Lett..

[22]  Ralph C. Merkle,et al.  Secrecy, authentication, and public key systems , 1979 .

[23]  John von Neumann,et al.  Theory Of Self Reproducing Automata , 1967 .

[24]  Mark A. Bedau,et al.  A Generic Neutral Model for Quantitative Comparison of Genotypic Evolutionary Activity , 1999, ECAL.

[25]  Paul C. van Oorschot,et al.  Parallel collision search with application to hash functions and discrete logarithms , 1994, CCS '94.