A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations.