Mapping efficiency and information content

Abstract. This paper proposes two compound measures of mapping quality to support objective comparison of spatial prediction techniques for geostatistical mapping: (1) mapping efficiency, defined as the cost per area per amount of variation explained by the model, and (2) information production efficiency, defined as the cost per byte of effective information produced. Both were inspired by concepts of complexity from mathematics and physics. Complexity, i.e. the total effective information, is defined as the number of bytes remaining after the predictions are rounded to half the mapping accuracy (the effective precision) and then compressed. It is postulated that mapping efficiency, for an area of given size and a limited budget, is essentially a function of inspection intensity and mapping accuracy. Both measures are illustrated using the Meuse and Ebergotzen case studies (gstat and plotKML packages). The results demonstrate that, for mapping organic matter (Meuse data set), regression-kriging gives a gain over ordinary kriging: mapping efficiency is 7% better and information production efficiency about 25% better (3.99 vs 3.14 EUR B⁻¹ for the GZIP compression algorithm). For mapping sand content (Ebergotzen data set), the mapping efficiency of ordinary kriging and regression-kriging is about the same, while the information production efficiency is 29% better for regression-kriging (37.1 vs 27.7 EUR B⁻¹ for the GZIP compression algorithm). Information production efficiency is possibly a more robust measure of mapping quality than mapping efficiency because: (1) it is scale-independent, (2) it can be more easily related to the concept of effective information content, and (3) it accounts for extrapolation effects. The limitation of deriving the information production efficiency is that reliable estimates of both the model uncertainty and the mapping accuracy are required.
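The two measures defined in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation (the case studies use the R packages gstat and plotKML): the function names are hypothetical, the mapping accuracy is assumed to be supplied as an RMSE, the amount of variation explained is assumed to be a fraction between 0 and 1, and the rounded predictions are assumed to be stored as 32-bit integers before GZIP compression.

```python
import gzip

import numpy as np


def effective_information_bytes(predictions, rmse):
    """Bytes left after rounding to half the mapping accuracy and GZIP compression.

    Assumption: `rmse` stands in for the mapping accuracy, so the effective
    precision (rounding step) is rmse / 2. Predictions must be finite.
    """
    step = rmse / 2.0
    # Express each prediction as an integer multiple of the effective precision.
    codes = np.round(np.asarray(predictions, dtype=float) / step).astype(np.int32)
    # Assumption: raw int32 bytes are the representation that gets compressed.
    return len(gzip.compress(codes.tobytes()))


def mapping_efficiency(total_cost, area, variation_explained):
    """Cost per unit area per amount of variation explained (fraction in (0, 1])."""
    return total_cost / (area * variation_explained)


def information_production_efficiency(total_cost, predictions, rmse):
    """Cost per byte of effective information, e.g. in EUR per byte."""
    return total_cost / effective_information_bytes(predictions, rmse)
```

For both measures a lower value means a cheaper map for the same quality. A map with little spatial detail relative to its accuracy compresses to few bytes and so has a high cost per byte, which is how the second measure penalises predictions that carry little effective information.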
