Evaluating factual accuracy in complex data-to-text