Improving Audio-Language Learning with MixGen and Multi-Level Test-Time Augmentation