Toward More Effective Human Evaluation for Machine Translation