A Comparative Study of Machine Learning Models for Predicting the State of Reactive Mixing

Accurate predictions of reactive mixing are critical for many Earth and environmental science problems. To investigate mixing dynamics over time under different scenarios, a high-fidelity, finite-element-based numerical model is built to solve the fast, irreversible bimolecular reaction-diffusion equations for a range of reactive-mixing scenarios. A total of 2,315 simulations are performed using different sets of model input parameters comprising the spatial scales of vortex structures in the velocity field, the time scales associated with velocity oscillations, the perturbation parameter for the vortex-based velocity, the anisotropic dispersion contrast, and molecular diffusion. Outputs comprise concentration profiles of the reactants and products. The inputs and outputs of these simulations are concatenated into feature and label matrices, respectively, to train 20 different machine learning (ML) emulators to approximate system behavior. The 20 ML emulators, which are based on linear methods, Bayesian methods, ensemble learning methods, and a multilayer perceptron (MLP), are compared to assess their performance. The ML emulators are specifically trained to classify the state of mixing and to predict three quantities of interest (QoIs) characterizing species production, decay, and degree of mixing. Linear classifiers and regressors fail to reproduce the QoIs; however, ensemble methods (classifiers and regressors) and the MLP accurately classify the state of reactive mixing and predict the QoIs. Among ensemble methods, random forest and decision-tree-based AdaBoost faithfully predict the QoIs. At run time, trained ML emulators are $\approx10^5$ times faster than the high-fidelity numerical simulations. The speed and accuracy of the ensemble and MLP models facilitate uncertainty quantification, which usually requires thousands of model runs, to estimate the uncertainty bounds on the QoIs.
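
To illustrate why a fast emulator makes uncertainty quantification tractable, the following is a minimal Monte Carlo sketch using only the Python standard library. It is not the study's implementation: `toy_emulator`, its two inputs, the sampling ranges, and the sample count are hypothetical placeholders that stand in for a trained ML emulator and its input distributions.

```python
import random
import statistics


def toy_emulator(diffusivity: float, perturbation: float) -> float:
    """Hypothetical stand-in for a trained emulator returning one QoI.

    The functional form is arbitrary and chosen only so the sketch runs;
    it does not represent the reaction-diffusion model in the study.
    """
    return 1.0 / (1.0 + diffusivity * (1.0 + perturbation))


def monte_carlo_bounds(emulator, n_samples: int = 5000, seed: int = 0):
    """Sample the inputs, evaluate the emulator, and return the
    2.5th percentile, median, and 97.5th percentile of the QoI."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    outputs = []
    for _ in range(n_samples):
        # Assumed (made-up) uniform ranges for the two inputs.
        diffusivity = rng.uniform(0.1, 10.0)
        perturbation = rng.uniform(0.0, 1.0)
        outputs.append(emulator(diffusivity, perturbation))
    # n=40 yields 39 cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(outputs, n=40)
    return cuts[0], statistics.median(outputs), cuts[-1]


lower, median, upper = monte_carlo_bounds(toy_emulator)
print(f"95% interval on the QoI: [{lower:.3f}, {upper:.3f}], median {median:.3f}")
```

Each emulator call here is a cheap function evaluation, so thousands of samples complete almost instantly; replacing `toy_emulator` with a call to the high-fidelity simulation would multiply the cost of every sample accordingly.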
