Compositional data analysis for elemental data in forensic science.

Discrimination of material based on elemental composition was achieved within a compositional data analysis (CoDa) framework, in a form appropriate for use in forensic science. The methods were applied to example data from New Zealand nephrite and gave good separation of in situ nephrite outcrops from within a well-defined area. The most significant benefit of working within the CoDa framework is that the implications of the constraints on the data are acknowledged and dealt with rather than ignored. The full composition was reduced on the basis of collinearity among elements, principal components analysis (PCA), and scalings from a backwards linear discriminant analysis (LDA). The resulting descriptive subcomposition was used for the final discrimination by LDA and proved more successful than the full composition. Classification based on the LDA model showed a mean error rate of 2.9% when validated with a 10-repeat, three-fold cross-validation. The methods presented lend objectivity to the process of interpretation, in contrast to subjective pattern-matching approaches.
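As an illustration only, the following Python sketch shows two ingredients of such a workflow: a log-ratio transformation that respects the constant-sum constraint, and the fold layout of a 10-repeat, three-fold cross-validation. The abstract does not state which log-ratio transformation was used, so the centred log-ratio (clr) here is an assumption, as are the function names and the sample values, which are hypothetical and not taken from the nephrite data. The LDA classifier itself is omitted. All parts must be strictly positive; zeros would need to be handled before the logarithm is taken.

```python
import math
import random


def closure(parts, total=1.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def repeated_kfold(n_samples, n_folds=3, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        for fold in range(n_folds):
            test = sorted(idx[fold::n_folds])
            held_out = set(test)
            train = [i for i in range(n_samples) if i not in held_out]
            yield rep, fold, train, test


# Hypothetical four-element measurement in arbitrary units (not from the paper).
sample = [52.0, 21.0, 13.0, 4.0]
composition = closure(sample)   # parts now sum to 1
coords = clr(composition)       # clr coordinates sum to 0

# 10 repeats x 3 folds = 30 train/test splits over 9 hypothetical specimens.
splits = list(repeated_kfold(n_samples=9))
```

Because clr coordinates depend only on ratios between parts, `clr(sample)` and `clr(composition)` agree up to floating-point error, which is the scale invariance that makes the constant-sum constraint harmless. A classifier would be fitted on each `train` index set and scored on the matching `test` set, and the error rates averaged over all 30 splits. Note that clr coordinates sum to zero, so their covariance matrix is singular; an LDA fitted on them would need one coordinate dropped or a different log-ratio basis.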
