Towards Multi-Scale Style Control for Expressive Speech Synthesis