The use of Benford's law for evaluation of quality of occupational hygiene data.

Benford's law is the counterintuitive empirical observation that the digits 1-9 are not equally likely to appear as the initial digit in numbers resulting from the same phenomenon. Manipulated, unrelated, or fabricated numbers usually do not follow Benford's law, so the law has been used to investigate fraudulent data, for example in accounting, and to identify errors in data sets arising from, for example, data transfer. We describe the use of Benford's law to screen occupational hygiene measurement data sets, using exposure data from the European rubber manufacturing industry as an illustration. Four data sets added to the European Union ExAsRub project database were compared with the expected first-digit (1BL) and second-digit (2BL) Benford distributions: two rubber process dust measurement data sets, initially collected by the UK Health and Safety Executive (HSE) and the British Rubber Manufacturers' Association (BRMA), and one pre-treatment and one post-treatment N-nitrosamines data set collated in the German MEGA database. Evaluation indicated only small deviations from the expected 1BL and 2BL distributions for the data sets collated by the UK HSE and by industry (BRMA), whereas larger deviations were observed for the MEGA data. To a large extent the latter could be attributed to the imputation of N-nitrosamine measurements below the limit of detection and their replacement by a constant, but further evaluation of these data to determine why other deviations from the expected 1BL and 2BL distributions exist may be beneficial. Benford's law is a straightforward and easy-to-implement analytical tool for evaluating the quality of occupational hygiene data sets. It can be used to detect potential problems in large data sets that may be caused by malicious a priori or a posteriori manipulation of the data and by issues such as the treatment of observations below the limit of detection, rounding, and transfer of data.
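As an illustration of how such a screening might be set up, the following minimal Python sketch computes the expected 1BL and 2BL probabilities from the standard Benford formulas and compares them with the digits observed in a set of measurements. The function names are hypothetical, and the Pearson chi-square statistic is an assumption made for this sketch; the abstract does not state which goodness-of-fit measure was used in the study.

```python
import math
from collections import Counter


def benford_first_digit():
    """Expected 1BL probabilities: P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_second_digit():
    """Expected 2BL probabilities for d2 = 0..9, summed over first digits d1."""
    return {
        d2: sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    }


def leading_digits(value):
    """Return the (first, second) significant digits of a non-zero number.

    Scientific notation puts the first digit at index 0 and the second
    at index 2 (after the decimal point), e.g. '1.230000000000000e-02'.
    """
    mantissa = f"{abs(value):.15e}"
    return int(mantissa[0]), int(mantissa[2])


def chi_square(values, position):
    """Pearson chi-square statistic of observed digit counts against Benford.

    position 0 tests the first digit (1BL), position 1 the second (2BL).
    The choice of statistic is an assumption of this sketch.
    """
    expected = benford_first_digit() if position == 0 else benford_second_digit()
    digits = [leading_digits(v)[position] for v in values if v > 0]
    if not digits:
        raise ValueError("no positive values to test")
    counts = Counter(digits)
    n = len(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in expected.items())
```

Applied to a data set in which many non-detects have been replaced by a single constant, `chi_square(values, 0)` would be expected to be inflated by the over-represented leading digit, which is the kind of deviation reported above for the MEGA data. Note that this sketch treats a value recorded as 0.5 as having second digit 0; a real analysis would need to decide how to handle values reported with a single significant figure.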
