Towards Document-Level Human MT Evaluation: On the Issues of Annotator Agreement, Effort and Misevaluation