On the role of Shannon's entropy as a measure of heterogeneity

Ibáñez et al. (1998) proposed the use of Shannon's entropy to analyze the diversity of the world pedosphere on the basis of data compiled by the F.A.O. at the scale 1:5,000,000. Here we will try to provide some mathematically founded arguments to justify the use and interpretation of Shannon's information entropy as a measure of diversity and homogeneity.

Information entropy h is computed from a discrete probability distribution {p_i : i = 1, 2, ..., N} via Shannon's formula h = -Σ_i p_i log p_i. This quantity was originally proposed by Shannon as a measure of the average information content gained from observing the realization of an experiment with N possible outcomes whose probabilities of occurrence are p_1, p_2, ..., p_N. Well-known mathematical facts are that (a) h attains its maximum value log N only in the equiprobable case, that is, p_i = 1/N for all i, and (b) h vanishes in the case that some p_j = 1 (and thus p_i = 0 for i ≠ j). These two extreme situations correspond, respectively, to (a) the most informative case, since when all outcomes are equally probable observing the actual outcome provides very rich information, and (b) the least informative case, since outcome j is certain to occur and observing it provides no new information. Also, the number h depends continuously on the probabilities p_i, so that similar distributions yield close values of h.
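The extremal and continuity properties stated above can be checked numerically. The following is a minimal sketch in Python (the function name shannon_entropy is ours, not from the original paper), using the convention 0 log 0 = 0:

```python
import math

def shannon_entropy(p):
    """Shannon's formula h = -sum_i p_i * log(p_i), with 0*log(0) taken as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

N = 4
uniform = [1.0 / N] * N               # equiprobable case: h = log N (maximum)
degenerate = [1.0, 0.0, 0.0, 0.0]     # some p_j = 1: h = 0 (minimum)
perturbed = [0.26, 0.24, 0.25, 0.25]  # close to uniform: h close to log N

print(shannon_entropy(uniform))     # log 4, approximately 1.3863
print(shannon_entropy(degenerate))  # 0.0
print(shannon_entropy(perturbed))   # approximately 1.3858, near log 4
```

The third distribution illustrates the continuity property: a small perturbation of the equiprobable case changes h only slightly.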