Can oracle-based imitation learning improve neural machine translation with data aggregation?