G-TwYST harmonisation of statistical methods for use of omics data in food safety assessment

The G-TwYST project focused mainly on the statistical analysis of results from animal studies in which a limited number of variables, based on OECD guidance, was measured. In the statistical analysis of these measurements, emphasis was placed on testing the equivalence of the GM groups relative to historical non-GM groups from the GRACE study, and a new univariate statistical approach was proposed for this purpose. G-TwYST paid only limited attention to food safety assessment using the plant material that served as feed in the animal studies, even though such material can be obtained easily. High-throughput untargeted omics measurements (e.g. metabolomics or transcriptomics) of this material could be an excellent way to check for unintended effects, because a larger number of compounds (e.g. metabolites, transcripts) is measured simultaneously. In G-TwYST, a limited number of samples from the maize harvests were analysed by transcriptomics and metabolomics. Subsequently, an already available multivariate one-class model was used to compare the G-TwYST samples to a reference set of measurements, as had been done earlier in the GRACE project. The pilot studies in GRACE and G-TwYST highlight the potential of omics-based approaches for food safety assessment. However, the statistical properties of the applied multivariate one-class model still need to be studied, and it is unclear whether, among a range of alternative models, it is the most appropriate method for omics-based food safety assessment. Additionally, the applied approach is distinctly different from the univariate methodology that was developed in G-TwYST for animal studies. There is no reason why statistical methodology should differ for similar data from plant or animal studies. From a regulator's point of view, the statistical criteria for evaluating food safety data should be the same as far as possible.
However, statistical methods will need to differ between highly multivariate omics data and the more traditional univariate measurements. The main purpose of this short report is therefore to identify the differences between the statistical approaches for food safety assessment that were applied in the context of the G-TwYST project, and to suggest directions for future research to improve the harmonisation of statistical methodology for analysing omics data.
