Abnormal Scene Classification using Image Captioning Technique: A Landslide Case Study

Disasters occur in many regions and disrupt human life and activity. In mountainous areas, one such disaster is the landslide, which can be triggered by heavy precipitation, loss of slope stability, or earthquakes. After a landslide, the damage and the affected area must be assessed before recovery can begin. Autonomous damage detection is therefore needed to evaluate the damaged region and to support decision-makers in planning recovery. This research presents a new technique for detecting damage in disaster situations, specifically landslides. The technique combines a transformer model for object detection with a vision encoder-decoder for image captioning. By using language, rather than visual features alone, to detect abnormal scenes, the proposed method can accurately predict the extent of damage in the affected region. The results show that the proposed method outperforms a conventional ResNet50 classifier in terms of accuracy, AUC, precision, recall, and F1 score. These findings suggest that the technique could be an effective tool for evaluating the scope of damage after a disaster and could thereby support the recovery process.
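To illustrate what classifying a scene from generated language might look like, the following is a minimal sketch and not the authors' implementation. It assumes captions have already been produced by some captioning model, flags a scene as abnormal when its caption contains a landslide-related term, and computes accuracy, precision, recall, and F1 from the resulting labels. The keyword list, the function names `is_abnormal` and `binary_metrics`, and the sample captions are all hypothetical; the abstract does not specify how captions are mapped to labels.

```python
from typing import Dict, Sequence

# Hypothetical keyword list: the source does not say how captions map to labels.
ABNORMAL_TERMS = ("landslide", "debris", "collapsed", "mudslide")


def is_abnormal(caption: str) -> bool:
    """Flag a scene as abnormal if its caption mentions any listed term."""
    text = caption.lower()
    return any(term in text for term in ABNORMAL_TERMS)


def binary_metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    total = tp + fp + fn + tn
    # Guard against empty denominators by falling back to 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up captions and labels, for illustration only.
    captions = [
        "a road covered by landslide debris",
        "a collapsed slope next to a house",
        "a car driving on a mountain road",
        "a hillside with debris near a river",
    ]
    labels = [True, True, False, False]
    predictions = [is_abnormal(c) for c in captions]
    print(binary_metrics(labels, predictions))
```

The fourth sample caption shows the main weakness of plain keyword matching: a harmless mention of "debris" is flagged as abnormal, which lowers precision. AUC is omitted because it requires a continuous score rather than a hard keyword decision.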
