Unsupervised phase mapping of X-ray diffraction data by nonnegative matrix factorization integrated with custom clustering

Analyzing large X-ray diffraction (XRD) datasets is a key step in high-throughput mapping of the compositional phase diagrams of combinatorial materials libraries. Optimizing and automating this task can accelerate the discovery of materials with novel and desirable properties. Here, we report a new method for pattern analysis and phase extraction of XRD datasets. The method extends Nonnegative Matrix Factorization (NMF), which has been used previously to analyze such datasets, by combining it with custom clustering and cross-correlation algorithms. The new method robustly determines the number of basis patterns present in the data, which in turn enables straightforward identification of any peak-shifted patterns. Peak shifting arises from the continuous change in lattice constants as a function of composition and is ubiquitous in XRD datasets from composition spread libraries. Successful identification of the peak-shifted patterns allows proper quantification and classification of the basis XRD patterns, which is necessary to decipher the contribution of each unique single-phase structure to the multi-phase regions. The process can be used to determine accurately the compositional phase diagram of a system under study. The method is applied to one synthetic and one experimental dataset, and in both cases it identifies the basis patterns and their peak-shifted variants accurately and robustly.

XRD mapping: pattern analysis and phase extraction

An algorithm can extract the peak-shifted patterns and phase diagrams of a given material from large X-ray diffraction (XRD) datasets. A team led by Ichiro Takeuchi from the University of Maryland and Boian Alexandrov from Los Alamos National Laboratory developed a new computational method based on non-negative matrix factorization and cross-correlation analysis, capable of identifying peak-shifted patterns in XRD datasets. Such features are due to changes in the lattice constant of a given material, and are thus of crucial importance for determining its phase structure and composition. By applying this approach to both a synthetic and an experimental dataset, the authors were able to detect peak shifting and thus extract the compositional phase diagram of the materials under investigation.
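
As a rough illustration of two of the ingredients named above, the following Python sketch factorizes a toy dataset with plain NMF and uses cross-correlation to measure the offset between a pattern and its peak-shifted copy. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian-peak patterns, the peak positions, the angular grid, and the function names (`gaussian_pattern`, `nmf`, `relative_error`, `best_shift`) are all hypothetical. The NMF here uses the standard Lee-Seung multiplicative updates, and the custom clustering step that selects the number of basis patterns is omitted.

```python
import numpy as np


def gaussian_pattern(two_theta, centers, width=0.3):
    """Toy diffraction pattern: a sum of unit-height Gaussian peaks (assumed, not real data)."""
    pattern = np.zeros_like(two_theta)
    for c in centers:
        pattern += np.exp(-0.5 * ((two_theta - c) / width) ** 2)
    return pattern


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF, X ~ W @ H, via Lee-Seung multiplicative updates (Frobenius norm)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps   # columns: candidate basis patterns
    H = rng.random((k, X.shape[1])) + eps   # rows: weights of each basis per sample
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(X, W, H):
    """Relative Frobenius-norm reconstruction error of the factorization."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


def best_shift(a, b):
    """Lag, in grid points, at which b best matches a (positive: b sits at higher angle)."""
    xc = np.correlate(b, a, mode="full")
    return int(np.argmax(xc)) - (len(a) - 1)


# Hypothetical toy data: two phases, one of which also appears peak-shifted by 0.5 degrees.
two_theta = np.linspace(20.0, 60.0, 401)            # 0.1 degree steps
phase_a = gaussian_pattern(two_theta, [30.0, 45.0])
phase_a_shifted = gaussian_pattern(two_theta, [30.5, 45.5])
phase_b = gaussian_pattern(two_theta, [25.0, 38.0, 52.0])

basis = np.column_stack([phase_a, phase_a_shifted, phase_b])   # shape (401, 3)
weights = np.random.default_rng(1).random((3, 40))             # 40 synthetic samples
X = basis @ weights                                            # shape (401, 40)

W, H = nmf(X, k=3)
print("relative reconstruction error:", relative_error(X, W, H))
print("shift between phase_a and its copy (grid points):",
      best_shift(phase_a, phase_a_shifted))
```

On this grid the cross-correlation reports a lag of 5 points, which corresponds to the 0.5 degree shift built into the toy data. Plain NMF treats a shifted pattern as a separate basis vector, which is why a shift-detection step of this kind is needed before basis patterns can be grouped by phase.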
