Determination of the Number of Significant Components in Liquid Chromatography Nuclear Magnetic Resonance Spectroscopy

Abstract: This paper evaluates the effectiveness of methods for determining the number of significant components in four simulated and four experimental liquid chromatography nuclear magnetic resonance (LC–NMR) spectrometric datasets. The following methods are tested: eigenvalues, log eigenvalues and eigenvalue ratios from principal component analysis (PCA) of the overall data; error indicator functions [residual sum of squares (RSS), residual standard deviation (RSD), ratio of successive residual standard deviations (RSDRatio), root mean square error (RMS), imbedded error (IE), factor indicator function, scree test and Exner function], together with their ratio of derivatives (ROD); F-tests [Malinowski, Faber–Kowalski (FK) and modified FK]; cross-validation; morphological score (MS); purity-based approaches, including the orthogonal projection approach (OPA) and SIMPLISMA; correlation and derivative plots; evolving PCA (EPCA) and evolving PC innovation analysis (EPCIA); and subspace comparison. Five sets of methods are selected as best: several error indicator functions, their ratio of derivatives, the RSD ratio, OPA concentration profiles, and EPCA using an expanding window (EW). When the dataset with the highest noise level is omitted, RSS, Malinowski's F-test, SIMPLISMA concentration profiles and subspace comparison with PCA scores also perform well.
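To make the error indicator functions named in the abstract more concrete, the following is a minimal sketch of how RSD, RMS, IE and the factor indicator function (IND) are commonly computed from the eigenvalues of a two-way data matrix, using the definitions usually attributed to Malinowski. It is an illustration only, not the authors' implementation: the function names `error_indicators` and `simulate`, the three-component Gaussian elution profiles, the random spectra and the noise level are all assumptions introduced here, and the data are not mean-centred.

```python
import numpy as np


def error_indicators(D):
    """Malinowski-style error indicators for n = 1 .. c-1 retained components.

    D is an r x c data matrix (e.g. elution time x chemical shift). It is
    transposed if needed so that r >= c, as the usual formulas assume.
    """
    D = np.asarray(D, dtype=float)
    if D.shape[0] < D.shape[1]:
        D = D.T
    r, c = D.shape
    # Eigenvalues of D^T D equal the squared singular values of D.
    lam = np.linalg.svd(D, compute_uv=False) ** 2
    n = np.arange(1, c)  # candidate numbers of components
    # Sum of the eigenvalues discarded when n components are retained.
    resid = np.array([lam[k:].sum() for k in n])
    rsd = np.sqrt(resid / (r * (c - n)))   # residual standard deviation
    rms = np.sqrt(resid / (r * c))         # root mean square error
    ie = rsd * np.sqrt(n / c)              # imbedded error
    ind = rsd / (c - n) ** 2               # factor indicator function
    return {"n": n, "eigenvalues": lam, "RSD": rsd, "RMS": rms, "IE": ie, "IND": ind}


def simulate(n_times=120, n_channels=30, noise_sd=0.01, seed=0):
    """Hypothetical three-component dataset: Gaussian profiles x random spectra + noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_times)[:, None]
    centres = np.array([50.0, 60.0, 70.0])
    C = np.exp(-0.5 * ((t - centres) / 5.0) ** 2)      # elution profiles, (n_times, 3)
    S = rng.uniform(0.0, 1.0, size=(3, n_channels))    # spectra, (3, n_channels)
    noise = rng.normal(0.0, noise_sd, size=(n_times, n_channels))
    return C @ S + noise


D = simulate()
res = error_indicators(D)
# One common heuristic: take the n at which IND reaches its minimum.
n_ind = int(res["n"][np.argmin(res["IND"])])
print("IND minimum at n =", n_ind)
```

On this well-conditioned simulated example the IND minimum is expected to coincide with the three components that were built in. Real LC–NMR data with higher or correlated noise may behave less cleanly, which is the kind of situation the comparison in the abstract is concerned with.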
