We're accustomed to hearing about the unreasonable effectiveness of mathematics: delightful, and unexpected, applications of theory to the real world. In the world of the Internet, we've seen it in the use of number theory in public-key cryptography (the Diffie-Hellman system, the RSA algorithm, elliptic curve cryptosystems) and in the use of graph theory in network design. In the world of Internet data security, we currently face the opposite situation: a problem in search of mathematical theory.

The problem is hash functions. A hash function is an easy-to-compute compression function that takes a variable-length input and converts it to a fixed-length output. The hashes in which we are interested, called cryptographic hash functions, are "one-way", which is to say, they should be easy to compute and "hard", or computationally expensive, to invert. Hash functions are used as a compact representation of a longer piece of data (a digital fingerprint) and to provide message integrity. Hashes provide integrity as follows: the hash value of a particular piece of data, h_0, is computed at an initial time t_0. When the data needs to be used later, at time t_1, the hash, h_1, is recomputed. If the two hashes are equal, then the data has not been altered.

Ralph Merkle, a co-inventor of public-key cryptography, calls hashes the "duct tape" of cryptography. Among other things, hashes are used to ascertain software integrity, in digital signatures, in message authentication, and as one-time passwords; they are employed in many Internet protocols, including SSL/TLS (the transport-layer protocol that enables secure Web transactions), IPsec, and SSH. Because hash functions "shrink" data, collisions between hashes are inevitable.
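The integrity check described above (compute a hash at one time, recompute it later, compare) can be sketched in a few lines. This is only an illustration: the text does not name a particular hash function, so SHA-256 from Python's standard `hashlib` module is assumed here, and the helper names `fingerprint` and `is_unaltered` are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string.

    SHA-256 is an assumption made for this sketch; the output is
    always 64 hex characters, whatever the input length.
    """
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_hash: str) -> bool:
    """Recompute the hash of data and compare it with the recorded value."""
    return fingerprint(data) == recorded_hash


# At the initial time, record the hash of the data.
original = b"Transfer $100 to Alice"
h0 = fingerprint(original)

# Later, recompute the hash before using the data.
tampered = b"Transfer $900 to Alice"
print(is_unaltered(original, h0))  # True
print(is_unaltered(tampered, h0))  # False

# Variable-length input, fixed-length output.
print(len(fingerprint(b"")), len(fingerprint(b"x" * 10_000)))  # 64 64
```

The comparison is only meaningful if the recorded hash itself is stored or transmitted where an attacker cannot replace it along with the data.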
There are three fundamental properties that a cryptographic hash should satisfy: pre-image resistance (sometimes called non-invertibility): it should be computationally infeasible to find an input which hashes to a specified output; second pre-image resistance: it should be computationally infeasible to find a second input that hashes to the same output as a specified input; and collision resistance: it should be computationally infeasible to find two different inputs that hash to the same output. In 1979 Merkle [10, pp. 12–13] and Gideon Yuval [12] independently observed that because of the "birthday" paradox, the well-known result that in a group of twenty-three people, the probability that two people share the same birthday …
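As an illustrative addition, the birthday figure for twenty-three people and its connection to collision resistance can be explored numerically. The sketch below assumes 365 equally likely, independent birthdays, and it uses a deliberately weak toy hash (the first two bytes of SHA-256) so that a collision search finishes instantly. The function names and the choice of SHA-256 are assumptions of this example, not something the text specifies.

```python
import hashlib


def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday,
    assuming `days` equally likely, independent birthdays."""
    if people > days:
        return 1.0
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of SHA-256: a toy hash with only 2**(8 * n_bytes) outputs."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Hash b"0", b"1", b"2", ... until two inputs share a truncated digest.

    Returns (first_input, second_input, number_of_inputs_hashed).
    """
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        digest = truncated_hash(message, n_bytes)
        if digest in seen:
            return seen[digest], message, counter + 1
        seen[digest] = message
        counter += 1


print(round(shared_birthday_probability(23), 4))  # 0.5073

a, b, tries = find_collision(2)
print(a != b, truncated_hash(a) == truncated_hash(b))  # True True
print(tries)  # number of inputs hashed before the first collision
```

For a hash with 2^16 possible outputs, the birthday argument suggests that a collision should appear after on the order of 2^8 = 256 inputs, far fewer than the 2^16 outputs one might naively expect to need; the exact count printed depends on the digests themselves.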
[1] Antoon Bosselaers et al. Collisions for the Compression Function of MD5. EUROCRYPT, 1994.
[2] Ralph C. Merkle et al. A fast software one-way hash function. Journal of Cryptology, 1990.
[3] Gideon Yuval et al. How to Swindle Rabin. Cryptologia, 1979.
[4] Antoine Joux et al. Collisions in SHA-0. CRYPTO, 2004.
[5] Susan Landau et al. Polynomials in the Nation's Service: Using Algebra to Design the Advanced Encryption Standard. Am. Math. Mon., 2004.
[6] Antoine Joux et al. Collisions of SHA-0 and Reduced SHA-1. EUROCRYPT, 2005.
[7] Alfred Menezes et al. Handbook of Applied Cryptography. 2018.
[8] Xiaoyun Wang et al. Finding Collisions in the Full SHA-1. CRYPTO, 2005.
[9] Ivan Damgård et al. A Design Principle for Hash Functions. CRYPTO, 1989.
[10] S. Landau. Standing the Test of Time: The Data Encryption Standard. 2000.
[11] F. P. Secrecy. RES: Anthropology and Aesthetics, 1994.