When a mixed ensemble sings a common song: Spatial grouping from temporal structure