A likelihood ratio model for the determination of the geographical origin of olive oil.

Food fraud and food adulteration can be of forensic interest, for instance in cases of suspected deliberate mislabelling. Because olive oil is valued for its potential health benefits and nutritional qualities, determining its geographical origin is of special interest. A likelihood ratio (LR) model has an advantage over typical chemometric methods: it takes into account how rare a sample is in the relevant population. This property is of particular interest to forensic scientists, so the aim of this study was to examine olive oil classification using different LR models and to assess their suitability under selected data pre-processing methods (logarithm-based data transformations) and a feature selection technique. The analysis was carried out on data describing 572 Italian olive oil samples characterised by the content of 8 fatty acids in the lipid fraction. Three classification problems, related to three regions of Italy (South, North and Sardinia), were considered. The correct classification rate and the empirical cross entropy were used as measures of each model's performance. LR models proved useful for determining the geographical origin of olive oil across many variants of data pre-processing: the rates of correct classification were close to 100% and a considerable reduction of information loss was observed. The work also presents a comparative study of the performance of linear discriminant analysis on the considered classification problems, and highlights an approach to choosing the value of the smoothing parameter for LR models based on kernel density estimation.
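
The two performance measures named above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single feature, a Gaussian kernel, and a rule-of-thumb bandwidth (the function names, the bandwidth rule and the equal prior are all assumptions made for illustration). The empirical cross entropy follows the usual definition, ECE = P(H1)/N1 Σ log2(1 + 1/(LR_i·O)) + P(H2)/N2 Σ log2(1 + LR_j·O), where O is the prior odds P(H1)/P(H2).

```python
import numpy as np

def rule_of_thumb_bandwidth(train):
    # Assumption: a common normal-reference rule, NOT the paper's own choice.
    train = np.asarray(train, dtype=float)
    return 1.06 * train.std(ddof=1) * len(train) ** (-1 / 5)

def kde_pdf(x, train, h):
    """Gaussian kernel density estimate at points x with bandwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    train = np.asarray(train, dtype=float)
    z = (x[:, None] - train[None, :]) / h
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (len(train) * h * np.sqrt(2 * np.pi))

def likelihood_ratio(x, train_h1, train_h2):
    """LR = f(x | H1) / f(x | H2), each density estimated by KDE."""
    f1 = kde_pdf(x, train_h1, rule_of_thumb_bandwidth(train_h1))
    f2 = kde_pdf(x, train_h2, rule_of_thumb_bandwidth(train_h2))
    return f1 / f2

def correct_classification_rate(lr_h1, lr_h2):
    """Share of samples on the correct side of LR = 1."""
    correct = np.sum(lr_h1 > 1) + np.sum(lr_h2 < 1)
    return correct / (len(lr_h1) + len(lr_h2))

def empirical_cross_entropy(lr_h1, lr_h2, prior_h1=0.5):
    """ECE for LRs of samples truly from H1 (lr_h1) and from H2 (lr_h2)."""
    odds = prior_h1 / (1 - prior_h1)
    penalty_h1 = np.mean(np.log2(1 + 1 / (lr_h1 * odds)))
    penalty_h2 = np.mean(np.log2(1 + lr_h2 * odds))
    return prior_h1 * penalty_h1 + (1 - prior_h1) * penalty_h2

# Synthetic stand-in for one feature measured in two regions (not real data).
rng = np.random.default_rng(0)
train_a, train_b = rng.normal(0.0, 1.0, 200), rng.normal(3.0, 1.0, 200)
test_a, test_b = rng.normal(0.0, 1.0, 200), rng.normal(3.0, 1.0, 200)

lr_a = likelihood_ratio(test_a, train_a, train_b)  # H1 true
lr_b = likelihood_ratio(test_b, train_a, train_b)  # H2 true
print("correct classification rate:", correct_classification_rate(lr_a, lr_b))
print("empirical cross entropy:", empirical_cross_entropy(lr_a, lr_b))
```

With equal priors, a method that always returns LR = 1 has an ECE of exactly 1 bit, so values below 1 indicate that the LRs reduce information loss relative to that neutral reference.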
