Application of Modern Supervised Pattern Recognition Methods in Chemometrics

Over the past two decades there has been growth in pattern recognition methods, catalysed especially by the machine learning community and the rapid growth in computing power. Methods developed two or three decades ago, such as principal components analysis, cross-validation and partial least squares, required limited computing power and are now embedded in modern software packages. Moore's law, as variously described, suggests a doubling of computer speeds every 2 years, or an increase of over 30,000 times in 30 years, yet modern pre-packaged chemometric software has not kept pace. Many problems are non-linear, especially outside mainstream analytical chemistry, and as such require approaches more usual in areas such as economics or biology. In addition, proper validation and optimisation usually require many iterations: for example, using a bootstrap together with test/training set splits might require a model to be re-formed 20,000 times. Self-organising maps are also a powerful alternative to principal components for the visualisation of relationships between samples. These methods are illustrated on a dataset of ancient Italian pottery from different sites.
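
To give a sense of how a figure such as 20,000 model fits can arise, the following is a minimal Python sketch and not the authors' procedure. It assumes 100 random test/training splits with 200 bootstrap resamples of each training set (100 × 200 = 20,000). That decomposition, the nearest-centroid classifier, the synthetic two-class data and all function names are illustrative assumptions.

```python
import random


def make_data(n_per_class=15, seed=1):
    """Synthetic two-class, two-variable data (hypothetical, not the pottery dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("A", 0.0), ("B", 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Re-form the model: one centroid (mean vector) per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])))


def validate(X, y, n_splits=100, n_boot=200, test_fraction=1 / 3, seed=0):
    """Repeated test/training splits with bootstrap resampling of each training set.

    The defaults (100 splits x 200 bootstraps) are assumed values chosen only
    so that the model is re-formed 20,000 times in total.
    """
    rng = random.Random(seed)
    n = len(X)
    n_test = max(1, round(n * test_fraction))
    n_fits = correct = total = 0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        for _ in range(n_boot):
            # Bootstrap: sample the training set with replacement.
            boot = [rng.choice(train) for _ in train]
            model = fit_centroids([X[i] for i in boot], [y[i] for i in boot])
            n_fits += 1
            # Score this re-formed model on the held-out test samples.
            for i in test:
                correct += predict(model, X[i]) == y[i]
                total += 1
    return {"n_fits": n_fits, "accuracy": correct / total}


X, y = make_data()
small = validate(X, y, n_splits=10, n_boot=20)  # 200 fits, quick to run
print(small["n_fits"], round(small["accuracy"], 3))
```

Calling `validate(X, y)` with the default arguments re-forms the model 100 × 200 = 20,000 times. The cost scales with the product of the two loop counts, which is why such validation schemes depend on the computing power now available.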