Passive sonar automated target classifier for shallow waters using end-to-end learnable deep convolutional LSTMs

Abstract: Automated target recognition is increasingly employed in sonar systems to reduce manning requirements and the challenges associated with them. Passive acoustic target recognition is exceptionally challenging, especially in shallow water, yet naval forces around the world use it because of its inherent advantages over the alternatives. To address these challenges and to exploit the latent, subtle features in the signal stream from the hydrophones, this paper proposes an end-to-end differentiable architecture. The key strategy is to rely on the data rather than on prior knowledge about the data. Raw acoustic signals from the hydrophones are fed directly to a pre-initialized 1-dimensional convolutional layer, followed by a cascade of 2-dimensional convolutional spectro-temporal feature learners. Various auditory scales are used for the pre-initialization, so as to emphasize the frequencies of interest. To better capture temporal relations, a bidirectional LSTM layer with a trainable attention module is employed. The best configuration of the proposed classifier achieves an accuracy of 95.2% on a large acoustic dataset collected from the shallows of the Indian Ocean.
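To make the idea of auditory-scale pre-initialization concrete, the following is a minimal sketch, not the authors' implementation. It assumes the mel scale (one of several possible auditory scales; the abstract does not say which ones were used) and builds Hann-windowed cosine kernels that could serve as initial weights for a 1-dimensional convolutional layer. The sample rate, filter count, kernel length, kernel shape, and all function names are illustrative assumptions.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10 formulation)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_min, f_max):
    """Centre frequencies spaced uniformly on the mel scale (n_filters >= 2)."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_filters - 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_filters)]

def init_kernel(center_hz, sample_rate, kernel_len):
    """Hann-windowed cosine at center_hz, scaled to unit L2 norm.

    The kernel shape is an assumption made for illustration only.
    """
    mid = (kernel_len - 1) / 2.0
    taps = []
    for n in range(kernel_len):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (kernel_len - 1))
        carrier = math.cos(2.0 * math.pi * center_hz * (n - mid) / sample_rate)
        taps.append(window * carrier)
    norm = math.sqrt(sum(t * t for t in taps))
    return [t / norm for t in taps]

# Hypothetical settings; none of these values come from the paper.
SAMPLE_RATE = 16000
N_FILTERS = 32
KERNEL_LEN = 257

centers = mel_center_frequencies(N_FILTERS, 20.0, SAMPLE_RATE / 2)
kernels = [init_kernel(fc, SAMPLE_RATE, KERNEL_LEN) for fc in centers]
```

Because the centre frequencies are uniform in mels, they are packed more densely at low frequencies, which is one way such an initialization can emphasize a band of interest. In an end-to-end setup these kernels would only be a starting point: the weights remain trainable and are free to drift away from the auditory scale during learning.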
