Self organising maps for visualising and modelling

The paper describes the motivation for SOMs (Self Organising Maps) and how they have become more accessible owing to the wider availability of modern, more powerful, cost-effective computers. Their advantages over Principal Components Analysis and Partial Least Squares are discussed: SOMs can be applied to non-linear data, are less dependent on least squares solutions and normality of errors, and are less influenced by outliers. In addition, there is a wide variety of intuitive methods for visualisation that allow full use of the map space. Modern problems in analytical chemistry, including applications to cultural heritage studies and to environmental, metabolomic and biological problems, result in complex datasets. Methods for visualising maps are described, including best matching units, hit histograms, unified distance matrices and component planes. Supervised SOMs for classification, including multifactor data and variable selection, are discussed, as is their use in Quality Control. The paper is illustrated using four case studies, namely Near Infrared spectroscopy of food, the thermal analysis of polymers, metabolomic analysis of saliva using NMR, and on-line HPLC for pharmaceutical process monitoring.
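The visualisation methods named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it is a plain unsupervised SOM on a rectangular grid, written with the Python standard library only. The function names, the 5 x 5 map size, the linear decay of the learning rate and neighbourhood width, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def best_matching_unit(weights, x):
    """Return (row, col) of the map unit whose weight vector is closest to x."""
    rows, cols = len(weights), len(weights[0])
    return min(((i, j) for i in range(rows) for j in range(cols)),
               key=lambda ij: sq_dist(weights[ij[0]][ij[1]], x))

def train_som(X, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.5, seed=0):
    """Train a SOM by presenting one randomly chosen sample per iteration."""
    rng = random.Random(seed)
    # Initialise every unit with a copy of a randomly chosen sample.
    weights = [[list(rng.choice(X)) for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly (assumed schedule)
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood width shrinks, floored at 0.5
        x = rng.choice(X)
        bi, bj = best_matching_unit(weights, x)
        for i in range(rows):
            for j in range(cols):
                # Gaussian neighbourhood centred on the best matching unit.
                h = math.exp(-((i - bi) ** 2 + (j - bj) ** 2) / (2.0 * sigma ** 2))
                w = weights[i][j]
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

def hit_histogram(weights, X):
    """Count how many samples have each unit as their best matching unit."""
    hits = [[0] * len(weights[0]) for _ in weights]
    for x in X:
        i, j = best_matching_unit(weights, x)
        hits[i][j] += 1
    return hits

def u_matrix(weights):
    """Mean distance from each unit to its horizontal and vertical grid neighbours."""
    rows, cols = len(weights), len(weights[0])
    U = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            d = [math.sqrt(sq_dist(weights[i][j], weights[ni][nj]))
                 for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                 if 0 <= ni < rows and 0 <= nj < cols]
            U[i][j] = sum(d) / len(d) if d else 0.0
    return U

def component_plane(weights, k):
    """Value of variable k in every unit's weight vector, laid out on the map grid."""
    return [[unit[k] for unit in row] for row in weights]

# Synthetic data (an assumption): two well-separated clusters in two variables.
data_rng = random.Random(1)
X = ([[data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)] for _ in range(30)] +
     [[data_rng.gauss(1.0, 0.1), data_rng.gauss(1.0, 0.1)] for _ in range(30)])

som = train_som(X)
hits = hit_histogram(som, X)
U = u_matrix(som)
plane0 = component_plane(som, 0)
```

In this sketch, large values in `U` mark boundaries between regions of the map, `hits` shows which units the samples occupy, and `plane0` shows how the first variable varies across the map. A supervised SOM or a hexagonal grid would need further changes that are not shown here.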
