Secondary Data Sources for Public Health: An Introduction to Secondary Data Analysis

What Are Secondary Data?

In the fields of epidemiology and public health, the distinction between primary and secondary data depends on the relationship between the person or research team who collected a data set and the person who is analyzing it. This is an important concept because the same data set could be primary data in one analysis and secondary data in another. If the data set in question was collected by the researcher (or a team of which the researcher is a part) for the specific purpose or analysis under consideration, it is primary data. If it was collected by someone else for some other purpose, it is secondary data. Of course, there will always be cases in which this distinction is less clear, but it may be useful to conceptualize primary and secondary data by considering two extreme cases. In the first, which is an example of primary data, a research team conceives of and develops a research project, collects data designed to address specific questions posed by the project, and performs and publishes their own analyses of the data they have collected. In this case, the people involved in analyzing the data have some involvement in, or at least familiarity with, the research design and data collection process, and the data were collected to answer the questions examined in the analysis.