Using an Existing Data Set to Answer New Research Questions: A Methodological Review

The vast majority of the research methods literature assumes that the researcher designs the study after determining the research questions. This assumption does not hold for the many researchers who conduct secondary data analysis. These researchers must not only understand the research concepts involved in designing a new study but also be aware of the challenges specific to conducting research with an existing data set. Techniques are discussed for determining whether secondary data analysis is appropriate. Suggestions are offered on how best to identify, obtain, and evaluate a data set; refine research questions; manage data; calculate power; and report results. Examples from nursing research are provided. If an existing data set is suitable for answering a new research question, a secondary analysis is preferable because it can be completed in less time, for less money, and with far lower risk to subjects. The researcher must carefully consider whether the existing data set’s available power and data quality are adequate to answer the proposed research questions.
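As an illustration of the power question raised above: in a secondary analysis the sample size is already fixed, so the calculation usually runs in reverse, asking what power the available sample gives for a plausible effect, or what effect is the smallest the sample can reliably detect. The following is a minimal sketch, not the authors' procedure. It assumes a two-sided comparison of two equal-sized groups, a standardized effect size (Cohen's d), and a normal approximation; the function names and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_group(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect_size  -- standardized mean difference (Cohen's d), assumed
    n_per_group  -- cases available per group in the existing data set
    alpha        -- two-sided significance level
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_detectable_effect(n_per_group, target_power=0.80, alpha=0.05):
    """Smallest effect size reaching target_power at a fixed sample size.

    Found by bisection, since power increases with effect size for d >= 0.
    """
    low, high = 0.0, 10.0
    for _ in range(100):
        mid = (low + high) / 2
        if power_two_group(mid, n_per_group, alpha) < target_power:
            low = mid
        else:
            high = mid
    return high


# Hypothetical existing data set with 64 usable cases in each group.
available_power = power_two_group(effect_size=0.5, n_per_group=64)
smallest_effect = min_detectable_effect(n_per_group=64)
print(f"power for d = 0.5: {available_power:.3f}")
print(f"minimum detectable d at 80% power: {smallest_effect:.3f}")
```

If the minimum detectable effect is larger than any effect that is clinically plausible, the existing data set is probably underpowered for the new question. Data sets drawn with complex sampling designs would need the effective rather than the nominal sample size.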
