Dialect-robust Evaluation of Generated Text