A scoring metric for multivariate data for reproducibility analysis using chemometric methods.

Process quality control and reproducibility in emerging measurement fields such as metabolomics are normally assured by interlaboratory comparison testing. As part of this testing, spectral features from a spectroscopic method such as nuclear magnetic resonance (NMR) spectroscopy are attributed to particular analytes within a mixture, and the resulting metabolite concentrations are returned for comparison between laboratories. However, data quality may also be assessed directly from binned spectral data, before the time-consuming identification and quantification steps. Using the binned spectra has some advantages, including preserving information about trace constituents and enabling identification of process difficulties. In this paper, we demonstrate the use of binned NMR spectra to conduct a detailed interlaboratory comparison and composition analysis. Spectra of synthetic and biologically obtained metabolite mixtures, taken from a previous interlaboratory study, are compared by cluster analysis using a variety of distance and entropy metrics. The individual measurements are then evaluated based on where they fall within their clusters, and a laboratory-level scoring metric is developed that provides an assessment of each laboratory's individual performance.
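To make the idea of comparing binned spectra with distance and entropy metrics concrete, the following is a minimal sketch and not the scoring metric developed in the paper. It assumes each binned spectrum is a list of non-negative intensities that can be normalized to sum to one, and it uses the Jensen-Shannon divergence and the Hellinger distance as two example metrics. The function `lab_scores`, the laboratory labels, and the intensity values are all hypothetical and chosen only for illustration.

```python
import math


def normalize(spectrum):
    """Scale non-negative bin intensities so that they sum to 1."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    return [x / total for x in spectrum]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; bins with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits (symmetric, between 0 and 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hellinger(p, q):
    """Hellinger distance between two normalized spectra (between 0 and 1)."""
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(0.5 * squared)


def lab_scores(spectra_by_lab, metric=jensen_shannon):
    """Hypothetical score: mean distance from each lab's spectrum to the others.

    A larger score means the lab's spectrum sits farther from its peers.
    """
    normalized = {lab: normalize(s) for lab, s in spectra_by_lab.items()}
    scores = {}
    for lab, p in normalized.items():
        others = [q for other, q in normalized.items() if other != lab]
        scores[lab] = sum(metric(p, q) for q in others) / len(others)
    return scores


# Illustrative (made-up) binned intensities for one sample measured by four labs.
spectra = {
    "lab_A": [10, 20, 30, 40],
    "lab_B": [11, 19, 31, 39],
    "lab_C": [10, 21, 29, 40],
    "lab_D": [40, 30, 20, 10],  # deliberately different from the rest
}

scores = lab_scores(spectra)
for lab, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{lab}: {score:.4f}")
```

In this toy setup the lab whose spectrum differs most from its peers receives the largest score. A fuller treatment would feed the pairwise distance matrix into a clustering step and score each measurement by its position within its cluster, as the abstract describes.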
