Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues