Use of textual elements to improve reliability prediction for aircraft component behavior

Unplanned maintenance is a costly factor in aircraft operations. Predictive maintenance models aim to provide greater insight into future component and system behaviour. The state of the art uses a variety of statistical models and machine learning techniques, amongst others, to estimate component remaining useful life. These approaches commonly leverage technical information, such as sensor data. However, the use of data and techniques from other domains is not prevalent. One such example is the application of natural language processing to incorporate textual information, e.g. derived from pilot complaint data. This raises the question of whether the presence and specific content of pilot complaints can improve the predictability of component removals. In this research, data integration and processing techniques from multiple disciplines are combined to address this question. Relevant words from pilot complaints are identified using a term frequency–inverse document frequency (TF-IDF) numerical analysis, after which the most relevant words are used as covariates in a proportional hazards model. Left truncation and right censoring are applied to limit the time-invariant nature of these covariates. The results, in the form of hazard ratios, indicate a hazard increase of several orders of magnitude with respect to the baseline hazard, pointing towards the potential value of including these words as predictive parameters.
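
To make the TF-IDF step concrete, the following is a minimal sketch of how words in complaint texts could be scored and ranked. It is not the implementation used in this research: the weighting variant (relative term frequency multiplied by log(N/df)), the whitespace tokenisation, the function names `tfidf_scores` and `top_terms`, and the example complaint texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """Return one {term: tf-idf} dict per document.

    Assumed weighting: (term count / document length) * log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Rank terms by their highest tf-idf score in any document."""
    best = {}
    for doc_scores in tfidf_scores(documents):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Hypothetical complaint texts, invented purely for illustration.
complaints = [
    "valve leak observed during climb",
    "valve indication normal after reset",
    "hydraulic leak observed near valve",
]

if __name__ == "__main__":
    print(top_terms(complaints, k=3))
```

In this toy corpus, a word that occurs in every complaint receives a score of zero, because log(N/df) vanishes when df equals N, so it would not be selected. Words retained by such a ranking could then be encoded, for example as binary presence indicators, and supplied as covariates to a proportional hazards model.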