Using ‘new’ data sources for ‘old’ newspaper research: Developing guidelines for data collection

Abstract: This article discusses the benefits and limitations of collecting electronic data for large-scale thematic content analysis. We address three methodological and technical issues. The first is the construction of a list of relevant keywords, which serves as the primary data collection device. Such a list is not only a technical necessity; it also helps secure a theoretically and empirically valid collection of data. The second is the quality of the information held in electronic archives. The third concerns source-specific data characteristics and coding difficulties. In conclusion, we propose seven guidelines for electronic data collection.