Study of spectral analytical data using fingerprints and scaled similarity measurements

A new chemoinformatic model has been developed for amplifying the differences between spectra and has been applied to the differentiation of wines according to grape origin, grape variety, and ageing process. The model is based on the generation of fingerprints from normalised spectra, using empirical parameters and a set of 120 samples. After the fingerprints had been generated, similarity matrices were built from the Tanimoto similarity index between the fingerprints of the samples. Calculation of the Tanimoto index was modified to adapt it to the characteristics of the analytical measurements: scaling factors derived from pattern fingerprints, each generated from a group of samples with common characteristics, were used together with a modified expression for the index. Principal-components analysis (PCA) and soft independent modelling of class analogy (SIMCA) were applied to the similarity matrices. The results are discussed as a function of the normalisation method employed, the empirical factor used in generating the fingerprints, and the selection of samples for building the pattern fingerprint. Finally, the wine differentiation achieved with the proposed model is compared with that obtained by applying PCA directly to the unprocessed spectra.
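The pipeline described above (normalise, generate fingerprints, build a similarity matrix) can be illustrated with a minimal Python sketch. It rests on several assumptions that do not come from the abstract: min-max normalisation, a binary fingerprint obtained by thresholding (the `threshold` argument stands in for the empirical parameter), and the standard Tanimoto index. The scaling factors and the modified Tanimoto expression used in the study are not specified here, so they are not reproduced. All function names and the toy spectra are hypothetical.

```python
def normalise(spectrum):
    """Min-max normalise a spectrum to [0, 1] (assumed normalisation method)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(x - lo) / (hi - lo) for x in spectrum]


def fingerprint(spectrum, threshold=0.5):
    """Binary fingerprint: 1 where the normalised intensity exceeds `threshold`.

    `threshold` is a hypothetical stand-in for the empirical parameter.
    """
    return [1 if x > threshold else 0 for x in normalise(spectrum)]


def tanimoto(fp_a, fp_b):
    """Standard Tanimoto index c / (a + b - c) for two binary fingerprints."""
    a = sum(fp_a)                                # bits set in A
    b = sum(fp_b)                                # bits set in B
    c = sum(x & y for x, y in zip(fp_a, fp_b))   # bits set in both
    denom = a + b - c
    return 1.0 if denom == 0 else c / denom


def similarity_matrix(spectra, threshold=0.5):
    """Pairwise Tanimoto similarities between the fingerprints of all spectra."""
    fps = [fingerprint(s, threshold) for s in spectra]
    return [[tanimoto(p, q) for q in fps] for p in fps]


# Toy spectra (invented values, five channels each).
spectra = [
    [0.1, 0.9, 0.8, 0.2, 0.1],
    [0.2, 1.0, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.9, 1.0],
]
S = similarity_matrix(spectra)
for row in S:
    print(row)
```

The resulting matrix `S` is symmetric with ones on the diagonal; in a workflow like the one described, each row would serve as the feature vector of a sample for PCA or SIMCA.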
