Methodology to improve data quality from chart review in the managed care setting.

BACKGROUND: Because inherent variability may exist in data collected by multiple reviewers, or may arise from difficulties with data abstraction tools, we developed a standardized method of evaluating interrater reliability (IRR) for clinical studies, HEDIS effectiveness-of-care measures, and onsite/medical record reviews.

OBJECTIVE: To demonstrate the ability of our standardized methods of data collection and analysis of results to determine the extent of agreement between multiple reviewers, identify areas for improvement in data collection procedures, and improve data reliability.

STUDY DESIGN: A prospective chart review with concurrent IRR evaluation.

METHODS: A subsample of patient records included in the Highmark Blue Cross Blue Shield/Keystone Health Plan West basic medical review for each HEDIS measure was selected for the IRR study. An experienced nurse (the "gold standard") conducted a blinded concurrent review of these records. Using the kappa statistic, we evaluated interobserver agreement between the results of the onsite reviewers and the "gold standard" from 1997 through 2000. Revised data collection methods and enhanced reviewer training were incorporated for measures showing areas for rater improvement.

RESULTS: Results across years showed excellent IRR for most measures; however, each year 1 or 2 measures showed areas for rater improvement (1997 Papanicolaou kappa = 0.50; 1998 well-child visits 3 to 6 years kappa = 0.37; 1999 comprehensive diabetes kappa = 0.73; high blood pressure kappa = 0.73). After these measures were reevaluated, kappa results showed excellent interrater agreement in subsequent years.

CONCLUSIONS: Standardized methods of data collection and evaluation of IRR results provide health plans with increased confidence in data collection and statistical analyses, and in reaching conclusions and deriving relevant recommendations.
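For readers unfamiliar with the statistic referenced in METHODS: Cohen's kappa corrects observed agreement for agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement between two raters and p_e is the proportion expected by chance from their marginal rating frequencies. The following is a minimal sketch, assuming Python with scikit-learn installed; the ratings below are invented for illustration and are not data from the study.

# Minimal sketch of a kappa-based IRR check between two reviewers.
# Ratings are hypothetical: 1 = HEDIS measure criterion met in the
# chart, 0 = not met, judged independently on the same chart subsample.
from sklearn.metrics import cohen_kappa_score

onsite_reviewer = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
gold_standard   = [1, 0, 1, 0, 0, 1, 1, 1, 1, 1]

# Chance-corrected agreement; ranges from -1 to 1, with 1 indicating
# perfect agreement and 0 indicating agreement no better than chance.
kappa = cohen_kappa_score(onsite_reviewer, gold_standard)
print(f"kappa = {kappa:.2f}")

In a workflow like the one described above, measures yielding lower kappa values (such as the 0.37 to 0.73 results reported here) would be flagged for revised data collection methods and additional reviewer training, then reevaluated in subsequent review cycles.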
