Minimum Divergence vs. Maximum Margin: an Empirical Comparison on Seq2Seq Models