Perspective on essential information in multivariate curve resolution

Abstract: We propose a new perspective on the construction and interpretation of multivariate curve resolution (MCR) models for the decomposition of spectral mixture data. We start by introducing archetypes, i.e. points that approximate the convex hull of a data cloud and correspond to the most linearly dissimilar observations. Identifying archetypes is a way to select the essential samples (ESs) and essential variables (EVs) of a data matrix before MCR decomposition. Working with ESs and EVs has three main implications. The first is data reduction, which brings simplicity and computational speed. The second is prioritization: the profiles of the ESs and EVs are the dominant features for solving the MCR problem. The third is interpretability: the reduced data sets provide more direct insight into, and a better understanding of, the final MCR models. Overall, the selection of ESs and EVs offers new opportunities that are worth exploring.
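The idea of picking essential samples from the convex hull of a data cloud can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the abstract: the function name `essential_samples`, the choice of a truncated SVD followed by normalization by the first score, the use of SciPy's `ConvexHull`, and the simulated noise-free three-component mixture data.

```python
import numpy as np
from scipy.spatial import ConvexHull


def essential_samples(D, n_components):
    """Return sorted row indices of D lying on the convex hull of its normalized scores.

    Hypothetical helper for illustration; assumes n_components >= 2 and
    rows of D with a nonzero first score (e.g. nonnegative spectra).
    """
    # Truncated SVD: each row of `scores` is one sample in the component subspace.
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # Dividing by the first score places all samples on a common affine
    # hyperplane; the first coordinate is then constant and is dropped.
    normalized = scores[:, 1:] / scores[:, [0]]
    if normalized.shape[1] == 1:
        # In one dimension the hull is just the two extreme points.
        col = normalized[:, 0]
        return np.sort(np.array([col.argmin(), col.argmax()]))
    return np.sort(ConvexHull(normalized).vertices)


# Simulated data (assumption): three Gaussian bands as pure spectra.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 50)
S = np.array([np.exp(-((wavelengths - c) ** 2) / (2 * 0.1 ** 2))
              for c in (0.25, 0.5, 0.75)])

# Rows 0-2 are pure samples; the other 200 rows are strict mixtures
# in which every component contributes at least 10 %.
C = np.vstack([np.eye(3), 0.1 + 0.7 * rng.dirichlet(np.ones(3), size=200)])
D = C @ S  # bilinear mixture model, noise-free

es = essential_samples(D, n_components=3)
D_reduced = D[es]  # reduced data set passed on to MCR
print(es, D_reduced.shape)
```

In this noise-free setting every mixture is a convex combination of the pure samples in the normalized score space, so only the pure rows sit on the hull. Under the same assumptions, essential variables could be sketched by passing the transposed matrix `D.T` to the same function. With real, noisy data the hull typically contains more points, and the number of components has to be chosen beforehand.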
