The evolution of chemometrics

Abstract Chemometrics is the application of statistical and mathematical methods to chemical problems to permit maximal collection and extraction of useful information. The development of advanced chemical instruments and processes has led to a need for advanced methods to design experiments, calibrate instruments, and analyze the resulting data. For many years, the prevailing view was that if one needed fancy data analyses, then the experiment had not been planned correctly. It is now recognized that most systems are multivariate in nature and that univariate approaches are unlikely to result in optimum solutions. As instruments have evolved in complexity, computational capability has similarly advanced, making it possible to develop and employ increasingly complex and computationally intensive methods. In this paper, the development of chemometrics as a subfield of chemistry, and particularly of analytical chemistry, will be presented along with a view of the current state of the art and the prospects for the future.
