Video Captioning Using Global-Local Representation