Adaptively Aligned Image Captioning via Adaptive Attention Time