Boosting Audio-visual Zero-shot Learning with Large Language Models