A note on spectral data simulation

Abstract: In chemometrics, new methods are commonly tested on simulated data. However, few articles discuss spectral data simulation in its own right; in most cases, the simulation is designed for one specific method. As a result, the simulation choices are often hard to understand, and it is hard to build a simulation suited to the problem one wishes to highlight. A generic simulation framework would make such simulations both easier to understand and easier to carry out. This article proposes a generic framework for simulating databases that represent the problem of interest and for describing the simulation procedure clearly. The framework relies on basic principles of chemometrics and allows simple, fast data simulation, as illustrated by three examples.
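
To make the idea of a simple, fast spectral simulation concrete, the following is a minimal sketch and not the framework proposed in the article. It assumes the standard bilinear mixture model X = C S + E, in which each simulated spectrum is a concentration-weighted sum of pure component spectra plus additive noise. The Gaussian peak shapes, the wavelength range, the noise level, and the function names `gaussian_peak` and `simulate_spectra` are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_peak(wavelengths, center, width):
    """Return a Gaussian-shaped band evaluated on the wavelength axis."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)


def simulate_spectra(n_samples=50, n_wavelengths=200, noise_sd=0.01, seed=0):
    """Simulate spectra under the assumed bilinear model X = C S + E."""
    rng = np.random.default_rng(seed)

    # Hypothetical wavelength axis (arbitrary units, chosen for illustration).
    wavelengths = np.linspace(1000.0, 2500.0, n_wavelengths)

    # Pure component spectra S, one row per component (assumed Gaussian bands).
    S = np.vstack([
        gaussian_peak(wavelengths, 1450.0, 60.0),
        gaussian_peak(wavelengths, 1940.0, 80.0),
    ])

    # Concentrations C, one row per sample, drawn uniformly in [0, 1).
    C = rng.uniform(0.0, 1.0, size=(n_samples, S.shape[0]))

    # Additive white Gaussian noise E.
    E = rng.normal(0.0, noise_sd, size=(n_samples, n_wavelengths))

    # Simulated data matrix: samples in rows, wavelengths in columns.
    X = C @ S + E
    return wavelengths, C, S, X


wavelengths, C, S, X = simulate_spectra()
print(X.shape)  # (50, 200)
```

Because the concentrations `C` and the noise level are known exactly, a data set generated this way can serve as ground truth when checking whether a calibration or resolution method recovers them.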
