Learning Sound Representations Using Triplet-loss