A data quality metrics hierarchy for reliability data

In this paper, we describe an approach to understanding data quality issues in field data used for the calculation of reliability metrics such as availability, reliability over time, or MTBF. The focus lies on data from sources such as maintenance management systems or warranty databases, which contain information on failure times and failure modes for all units. We propose a hierarchy of data quality metrics that identify and assess key problems in the input data. The metrics are organised in such a way that they guide the data analyst to the problems with the greatest impact on the calculation and provide a prioritised action plan for improving data quality. The metrics cover issues such as missing, wrong, implausible, and inaccurate data. We use examples with real-world data to showcase our software prototype and to illustrate how the metrics have helped with data preparation. In this way, analysts can reduce the number of wrong conclusions drawn from the data due to mistakes in the input values.