Towards Better NLP System Evaluation

This paper considers key elements of evaluation methodology, indicating the many factors involved and advocating an unpacking approach to specifying an evaluation remit and design. Recognising the importance of both environment variables and system parameters leads to a grid organisation for tests. The paper illustrates the application of these notions with two examples.