Unsupervised learning for seabed type and source parameters from surface ship spectrograms

The lack of labeled field data in ocean acoustics limits the application of supervised machine learning techniques. Unsupervised techniques have shown potential for classifying signals, but questions remain as to how much additional information they can learn from ocean acoustic data. This paper evaluates the ability of an unsupervised k-means clustering algorithm to provide information about the seabed and the location of a moving source. The training data are synthetic spectrograms from surface ships, generated for a wide variety of ship speeds, closest distances to the hydrophone, and seabed types. The resulting clusters are evaluated to discover trends in which types of spectrograms fall into each cluster, and the characteristics of the cluster centroids are investigated. These studies highlight the speed-distance ambiguity seen in ship spectrograms as well as the ambiguity between distance and the bottom loss of the seabed. The trained algorithm is tested on both synthetic spectrograms and spectrograms from merchant and cargo ships recorded during the Seabed Characterization Experiment in 2017. The future of unsupervised machine learning in ocean acoustics depends on correctly interpreting the results and testing the ability of trained networks to generalize.
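As a rough illustration of the clustering step, the sketch below runs a plain Lloyd's k-means on flattened toy spectrograms. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `toy_spectrogram` and `kmeans`, the spherical-spreading level model with no seabed or frequency dependence, the parameter ranges, the noise level, and the choice of four clusters. In this toy model the time dependence of the received level is governed by the ratio of ship speed to closest distance, which gives a simple picture of how a speed-distance ambiguity can arise.

```python
import numpy as np


def toy_spectrogram(cpa_m, speed_mps, rng, n_t=16, n_f=8):
    """Hypothetical toy spectrogram: spherical spreading only, no seabed physics."""
    t = np.linspace(-600.0, 600.0, n_t)            # seconds relative to closest approach
    r = np.sqrt(cpa_m**2 + (speed_mps * t) ** 2)   # source-receiver range in meters
    level = -20.0 * np.log10(r)                    # relative received level in dB
    spec = np.repeat(level[:, None], n_f, axis=1)  # same level in every frequency bin
    return spec + rng.normal(0.0, 0.5, size=spec.shape)  # additive noise (assumed)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):                       # keep old centroid if cluster is empty
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final assignment against the returned centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)


rng = np.random.default_rng(1)
cpa = rng.uniform(500.0, 5000.0, size=200)         # closest distances in meters (assumed range)
speed = rng.uniform(2.0, 10.0, size=200)           # ship speeds in m/s (assumed range)
X = np.stack([toy_spectrogram(c, v, rng).ravel() for c, v in zip(cpa, speed)])

centroids, labels = kmeans(X, k=4)

# Inspect which source parameters ended up in each cluster.
for j in range(4):
    in_j = labels == j
    if in_j.any():
        print(j, int(in_j.sum()), round(float(cpa[in_j].mean())), round(float(speed[in_j].mean()), 1))
```

Printing the mean closest distance and speed per cluster is one simple way to look for trends in cluster membership. A realistic study would replace the toy generator with a propagation model that includes the seabed.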