DOAD: An Online Dredging Operation Anomaly Detection Method based on AIS Data

Dredging is the removal of sediment from the bottom of lakes, rivers, harbors, and other water bodies. It is a routine necessity in waterways around the world because sand and silt naturally wash downstream and gradually fill channels and harbors. However, during dredging operations, some dredgers do not transport the sediment to the designated area as expected but instead dump it near the waterway, so the sediment may return to the waterway within a short period. This paper proposes an online dredging operation anomaly detection (DOAD) method that uses automatic identification system (AIS) data to detect this kind of irregular behavior during the dredging operation. First, we establish a feature system to extract behavior features from AIS data. Second, we jointly use t-distributed stochastic neighbor embedding (t-SNE) with neural networks and a Gaussian mixture model (GMM) to train a detection model in a semi-supervised way. With the trained model, irregular behaviors can be detected efficiently in real time during the dredging operation. The effectiveness of DOAD is evaluated through a series of experiments. To the best of our knowledge, this work is the first to apply AIS data to the detection of irregular behaviors during dredging operations.
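To make the idea of scoring AIS-derived behavior features with a GMM concrete, the following is a minimal sketch and not the DOAD implementation. It assumes two hypothetical per-window features (mean speed and mean absolute course change), fits a diagonal-covariance GMM by EM on synthetic windows assumed to be normal, and flags new windows whose log-likelihood falls below a low quantile of the training scores. The parametric t-SNE embedding step is omitted here, and all function names, features, and thresholds are illustrative assumptions.

```python
import math
import random

def window_features(track):
    """track: list of (speed_knots, course_deg) AIS samples in one time window.
    Returns (mean speed, mean absolute course change); hypothetical features."""
    speeds = [s for s, _ in track]
    turns = []
    for (_, c0), (_, c1) in zip(track, track[1:]):
        d = abs(c1 - c0) % 360.0
        turns.append(min(d, 360.0 - d))  # handle wrap-around at 0/360 degrees
    mean_turn = sum(turns) / len(turns) if turns else 0.0
    return (sum(speeds) / len(speeds), mean_turn)

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def component_terms(x, gmm):
    """Log of weight times density for each mixture component."""
    return [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]

def log_likelihood(x, gmm):
    """Log density of x under the mixture, via log-sum-exp."""
    terms = component_terms(x, gmm)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def fit_gmm(points, k=2, iters=50, seed=0):
    """Fit a diagonal GMM with plain EM; returns a list of (weight, mean, var)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    gmean = [sum(p[d] for p in points) / n for d in range(dim)]
    gvar = [sum((p[d] - gmean[d]) ** 2 for p in points) / n + 1e-6
            for d in range(dim)]
    # Initialize means at random data points, variances at the global variance.
    gmm = [(1.0 / k, list(m), list(gvar)) for m in rng.sample(points, k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for p in points:
            terms = component_terms(p, gmm)
            top = max(terms)
            ex = [math.exp(t - top) for t in terms]
            z = sum(ex)
            resp.append([e / z for e in ex])
        # M-step: re-estimate weights, means, and variances.
        new_gmm = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            mean = [sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                    for d in range(dim)]
            var = [sum(r[j] * (p[d] - mean[d]) ** 2
                       for r, p in zip(resp, points)) / nj + 1e-6
                   for d in range(dim)]
            new_gmm.append((nj / n, mean, var))
        gmm = new_gmm
    return gmm

def flag_anomalies(gmm, train_points, new_points, quantile=0.01):
    """Flag points scoring below the given quantile of training log-likelihoods."""
    scores = sorted(log_likelihood(p, gmm) for p in train_points)
    threshold = scores[int(quantile * len(scores))]
    return [log_likelihood(p, gmm) < threshold for p in new_points]

# Synthetic "normal" windows (assumed values): slow dredging with moderate
# turning, and faster transit with little turning.
rng = random.Random(1)
dredge = [(rng.gauss(2.0, 0.3), abs(rng.gauss(5.0, 1.0))) for _ in range(200)]
transit = [(rng.gauss(8.0, 0.5), abs(rng.gauss(1.0, 0.3))) for _ in range(200)]
normal = dredge + transit

gmm = fit_gmm(normal, k=2)
# One typical dredging window and one window with unusually heavy turning.
flags = flag_anomalies(gmm, normal, [(2.0, 5.0), (5.0, 40.0)])
```

In an online setting, `window_features` would be applied to each incoming window of AIS messages and the resulting point scored against the already fitted mixture, so the per-window cost is a handful of density evaluations.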
