Frequent subgraph mining in oceanographic multi-level directed graphs

ABSTRACT We present an adaptation and application of frequent subgraph mining (FSM) to a time series of spatial multi-level directed graphs that depict probabilistic transitions of water masses between neighboring sea areas within a given time interval. The directed graphs are built from the results of a numerical model, the Mediterranean Ocean Forecasting System. We assign unique labels (geographical locations) to the vertices of the multi-level directed graphs and label the edges with discretized values of the transition probabilities between vertices. This modification allows the established gSpan algorithm to be used to search for frequent directed subgraphs in the sequence of such directed graphs. We thus obtain both general and specific subgraphs, such as convergences, divergences, and paths of the ocean currents in the numerical model. The substructures revealed by the directed subgraphs match oceanographic structures (gyres, convergences/divergences, and paths) deduced from field observations, and can also serve as a tool for validating the numerical model of circulation in the sea.
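
The labeling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it does not reproduce gSpan: it only discretizes transition probabilities into edge labels and counts how often single labeled edges, and pairs of labeled edges sharing a vertex, recur across a sequence of graphs. The equal-width binning, the function names, the `n_bins` and `min_support` parameters, and the toy sea areas "A", "B", "C" are all assumptions made for illustration. The sketch also assumes at most one edge per ordered vertex pair, so that with unique vertex labels each labeled edge occurs at most once per graph.

```python
from collections import Counter
from itertools import combinations


def discretize(p, n_bins=4):
    """Map a probability in [0, 1] to an integer bin label 0..n_bins-1.

    Equal-width bins are an assumption; the source does not state the scheme.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return min(int(p * n_bins), n_bins - 1)


def label_edges(graph, n_bins=4):
    """Turn {(src, dst): probability} into a set of (src, dst, bin) triples."""
    return frozenset((u, v, discretize(p, n_bins)) for (u, v), p in graph.items())


def frequent_edges(graphs, min_support, n_bins=4):
    """Labeled edges that occur in at least min_support graphs."""
    counts = Counter()
    for g in graphs:
        counts.update(label_edges(g, n_bins))
    return {e: c for e, c in counts.items() if c >= min_support}


def frequent_connected_pairs(graphs, min_support, n_bins=4):
    """Two-edge patterns (edges sharing a vertex) in at least min_support graphs."""
    frequent = frequent_edges(graphs, min_support, n_bins)
    counts = Counter()
    for g in graphs:
        # Only frequent edges can be part of a frequent two-edge pattern.
        edges = sorted(e for e in label_edges(g, n_bins) if e in frequent)
        for a, b in combinations(edges, 2):
            if {a[0], a[1]} & {b[0], b[1]}:
                counts[(a, b)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}


# Toy sequence of three graphs over hypothetical sea areas A, B, C.
graphs = [
    {("A", "B"): 0.9, ("B", "C"): 0.6, ("C", "A"): 0.1},
    {("A", "B"): 0.8, ("B", "C"): 0.55, ("C", "A"): 0.4},
    {("A", "B"): 0.95, ("B", "C"): 0.2},
]
print(frequent_edges(graphs, min_support=2))
print(frequent_connected_pairs(graphs, min_support=2))
```

Because every vertex carries a unique label, matching a pattern against a graph reduces to set membership of labeled edges, which is why this sketch needs no isomorphism test. A full miner such as gSpan would extend frequent patterns beyond two edges.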
