Error categories, detection, and reduction in a musical database

The results of a detailed study of errors in a musical database are presented. It is shown that the form of analytic processing employed may substantially magnify the effect of database errors, so that very small error rates can produce very large errors in the analytic results. An equation is devised to assist in estimating the effect of an overall error rate on arbitrary analytic measures derived from the database. Given the potential for erroneous or misleading analytic results, it is recommended that scholars using computer-based methods routinely calculate the error rates associated with their analytic procedures and present an analysis of errors alongside their results in order to validate their interpretations.

Four methods of error detection are examined: manual proof-reading, the double-entry method, programmed syntactic checking, and programmed heuristics. Significant differences were found in how thoroughly the different detection methods uncovered all errors of a given type. The double-entry method was found to be superior to all other methods of detection; the humanities scholar's traditional allegiance to manual proof-reading was not supported by this study. Programmed methods of error detection were found to be fallible, but nonetheless useful.
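
Two of the claims above can be illustrated with a minimal sketch. It is not the equation or procedure devised in the study, which is not reproduced here. The first function assumes, purely for illustration, that errors strike encoded tokens independently and that an analytic measure is spoiled if any one of the tokens it depends on is wrong. The second shows the basic idea behind double entry: the same passage is encoded twice, and the positions where the two encodings disagree are flagged for inspection. The function names and the pitch-token format are hypothetical.

```python
from __future__ import annotations


def measure_error_probability(token_error_rate: float, tokens_per_measure: int) -> float:
    """Chance that a measure built from n tokens contains at least one error.

    Illustrative assumption: token errors are independent, each occurring
    with the same probability.
    """
    return 1.0 - (1.0 - token_error_rate) ** tokens_per_measure


def double_entry_mismatches(first: list[str], second: list[str]) -> list[int]:
    """Return the positions at which two independent encodings disagree."""
    # Compare the tokens position by position over the shared length.
    mismatches = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
    # Positions present in only one encoding also count as disagreements.
    shorter, longer = sorted((len(first), len(second)))
    mismatches.extend(range(shorter, longer))
    return mismatches


if __name__ == "__main__":
    # A 0.1% token error rate affecting a measure that depends on 1000 tokens.
    print(round(measure_error_probability(0.001, 1000), 3))

    # Two hypothetical encodings of the same four-note passage.
    entry_a = ["C4", "D4", "E4", "F4"]
    entry_b = ["C4", "D4", "E5", "F4"]
    print(double_entry_mismatches(entry_a, entry_b))
```

Under these assumptions a token error rate of one in a thousand leaves a 1000-token measure with roughly a 63% chance of containing at least one error, which is the kind of magnification the abstract describes. The position-by-position comparison is deliberately naive: an omitted or inserted token shifts every later position, so a practical comparison would need to align the two encodings first.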