An Interpretable Aid Decision-Making Model for Flag State Control Ship Detention Based on SMOTE and XGBoost

Sound decisions on ship detention are vital to flag state control (FSC), and machine learning algorithms can assist in identifying ships that should be detained. In this study, we propose a novel interpretable machine-learning model for ship detention decisions, termed the SMOTE-XGBoost-Ship detention model (SMO-XGB-SD). It combines the extreme gradient boosting (XGBoost) algorithm with the synthetic minority oversampling technique (SMOTE) to identify whether a ship should be detained. Our verification results show that SMO-XGB-SD outperforms the random forest (RF), support vector machine (SVM), and logistic regression (LR) algorithms. In addition, using the Shapley additive explanations (SHAP) algorithm, the model provides a reasonable interpretation of its performance and highlights the most important features for identifying ship detention. The SMO-XGB-SD model provides an effective basis for aiding ship detention decisions by inland flag state control officers (FSCOs), for ship safety management by ship operating companies, and for training new FSCOs in maritime organizations.
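To illustrate the oversampling step named above, the following is a minimal sketch of the core SMOTE idea: new minority-class samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_sketch`, the value of `k`, the two toy features (ship age, number of deficiencies), and all data values are assumptions made for illustration only; the base point is also drawn at random, which simplifies the per-sample allocation used in the original SMOTE algorithm.

```python
import random
from math import dist


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (ship age in years, number of deficiencies)
detained = [(25.0, 12.0), (30.0, 15.0), (22.0, 9.0), (28.0, 11.0)]
not_detained = [
    (5.0, 1.0), (8.0, 2.0), (3.0, 0.0), (10.0, 3.0), (6.0, 1.0),
    (12.0, 2.0), (7.0, 0.0), (4.0, 1.0), (9.0, 4.0), (11.0, 2.0),
]

# Oversample the detained class until both classes have the same size
new_points = smote_sketch(detained, n_new=len(not_detained) - len(detained))
balanced_detained = detained + new_points
print(len(balanced_detained), len(not_detained))
```

In a pipeline of the kind described, the balanced training set would then be passed to a gradient-boosted classifier, with the oversampling applied to the training split only so that evaluation data remain untouched.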
