Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis