Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing

Abstract

Analysis of reaction-diffusion simulations representing complex mixing processes requires a large number of independent model runs. For each high-fidelity model simulation, the model inputs are varied and the predicted mixing behavior is represented by temporal and spatial changes in species concentration. The goal is then to discern how the model inputs (such as diffusivity, dispersion, anisotropy, and velocity field properties) impact the mixing process. This task is challenging and typically involves interpreting large model outputs that represent temporal and spatial changes of species concentration within the model domain. However, the task can be automated and substantially simplified by applying Machine Learning (ML) methods. In this paper, we present an application of an unsupervised ML method (called NTFk) that uses Non-negative Tensor Factorization (NTF) coupled with a custom clustering procedure based on k-means to reveal the temporal and spatial features in product concentrations. An attractive and unique aspect of the proposed ML method is that it ensures the extracted features are non-negative, which is important for obtaining a meaningful deconstruction of the mixing processes. The ML methodology is applied to a large set of high-resolution finite-element model simulations representing anisotropic reaction-diffusion processes in perturbed vortex-based velocity fields. The applied finite-element method ensures that spatial and temporal species concentrations are always non-negative, even in the case of high anisotropic contrasts. The simulated reaction is a fast irreversible bimolecular reaction A + B → C, where species A and B react to form species C. The reaction-diffusion model input parameters that control mixing include properties of the velocity field (such as vortex structures), anisotropic dispersion, and molecular diffusion.
We demonstrate the applicability of the ML feature extraction method to produce a meaningful deconstruction of model outputs to discriminate between different physical processes impacting the reactants, their mixing, and the spatial distribution of the product C. The presented ML analysis allowed us to identify additive temporal and spatial features that characterize mixing behavior. The application of the proposed NTFk approach is not limited to reactive-mixing. NTFk can be readily applied to any observed or simulated datasets that can be represented as tensors (multi-dimensional arrays) and have separable latent signatures or features.
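
To make the idea of additive, non-negative features concrete, the following is a minimal sketch of a non-negative CP (PARAFAC-style) factorization of a 3-way array using multiplicative updates. It is not the authors' NTFk implementation: the function names (`nonneg_cp`, `rel_error`), the choice of multiplicative updates, the synthetic data, and the reading of the three axes as (space, time, simulation run) are all assumptions made for illustration. The k-means clustering over repeated random restarts, which NTFk couples with the factorization, is omitted here.

```python
import numpy as np

def nonneg_cp(X, rank, n_iter=300, eps=1e-12, seed=0):
    """Non-negative CP factorization of a 3-way array (illustrative sketch).

    Approximates X[i, j, k] by sum_r A[i, r] * B[j, r] * C[k, r] with all
    entries of A, B, C kept non-negative by multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Random non-negative initial factors.
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    for _ in range(n_iter):
        # Each update multiplies a factor by a ratio of non-negative terms,
        # so non-negativity is preserved at every iteration.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def rel_error(X, A, B, C):
    """Relative Frobenius error of the rank-R reconstruction."""
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Hypothetical synthetic data: two non-negative latent features mixed
# along axes standing in for (space, time, simulation run).
rng = np.random.default_rng(1)
A_true, B_true, C_true = rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2))
X = np.einsum('ir,jr,kr->ijk', A_true, B_true, C_true)

err_init = rel_error(X, *nonneg_cp(X, rank=2, n_iter=0))  # error at the random start
A, B, C = nonneg_cp(X, rank=2)
err_final = rel_error(X, A, B, C)
print(f"relative error: {err_init:.3f} -> {err_final:.3f}")
```

Under this reading, each column of `A` would be a spatial feature, the matching column of `B` its temporal signature, and the matching column of `C` its weight in each simulation run. In a full workflow the number of features is unknown, and the abstract indicates that NTFk relies on a custom k-means-based clustering procedure together with the factorization; that step is outside this sketch.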
