Using Text Mining in Official Statistics

The new options offered by web technology have brought a tremendous increase in the number of actors in the statistical arena: producers, distributors, and users. These actors are not sufficiently informed about the technological progress made in the field of text mining or about the ways in which they can benefit from it. The NEMIS project, and especially its Working Group 5, aims to identify possible applications of text mining in the production and dissemination of official statistics. Examples of such applications include advanced querying of document warehouses on websites; analysing, processing and coding the answers to open-ended questions in questionnaire data; sophisticated access to internal and external sources of statistical metainformation; and “pulling” statistical data and metadata from the websites of sending institutions.