An Investigation of Machine Learning Methods Applied to Structure Prediction in Condensed Matter

Materials characterization remains a significant, time-consuming undertaking. Spectroscopic techniques are generally used in conjunction with empirical and ab initio calculations to elucidate structure. These experimental and computational methods typically require significant human input and interpretation, particularly for novel materials. Recently, the application of data mining and machine learning to problems in materials science has shown great promise in reducing this overhead. In the work presented here, several aspects of machine learning are explored with regard to characterizing a model material, titania, using solid-state Nuclear Magnetic Resonance (NMR). Specifically, a large dataset of $^{47}$Ti NMR spectra is generated using ab initio calculations on generated TiO$_2$ structures. Principal Component Analysis (PCA) reveals that the input spectra may be compressed by more than 90% before being used for subsequent machine learning. Two methods, Support Vector Regression (SVR) and Artificial Neural Networks (ANNs), are used to learn the complex mapping between structural details and input NMR spectra, and both demonstrate excellent accuracy when presented with test sample spectra. This work compares the two methods as one step towards the construction of an expert system for solid-state materials characterization.
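The PCA compression step described above can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the spectra are synthetic single Lorentzian lines rather than ab initio $^{47}$Ti spectra, and the names `make_spectra` and `pca_compress`, the 99% variance threshold, and all parameter ranges are assumptions chosen for illustration only.

```python
import numpy as np


def make_spectra(n_samples=300, n_points=200, seed=0):
    """Synthetic stand-in spectra: one Lorentzian line per sample (illustrative only)."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(-1.0, 1.0, n_points)
    # Two hypothetical "structural" parameters controlling line position and width.
    centers = rng.uniform(-0.3, 0.3, n_samples)
    widths = rng.uniform(0.1, 0.2, n_samples)
    c = centers[:, None]
    w = widths[:, None]
    spectra = w**2 / ((axis[None, :] - c) ** 2 + w**2)
    return spectra, np.column_stack([centers, widths])


def pca_compress(X, variance_kept=0.99):
    """Project X onto the fewest principal components reaching the variance target."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, variance_kept) + 1)
    components = Vt[:k]
    scores = Xc @ components.T  # compressed representation, shape (n_samples, k)
    return scores, components, mean


spectra, params = make_spectra()
scores, components, mean = pca_compress(spectra)

# Fraction of the original spectral dimensions that is discarded.
compression = 1.0 - scores.shape[1] / spectra.shape[1]

# Reconstruct the spectra from the compressed scores to check the information loss.
reconstructed = scores @ components + mean
rel_error = np.linalg.norm(reconstructed - spectra) / np.linalg.norm(spectra)

print(f"components kept: {scores.shape[1]} of {spectra.shape[1]}")
print(f"compression: {compression:.1%}, relative reconstruction error: {rel_error:.3f}")
```

In a setup like this, the compressed `scores` would serve as the inputs to a downstream regressor such as SVR or an ANN, with `params` standing in for the structural targets.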
