Application of diffusion maps to identify human factors of self-reported anomalies in aviation.

This study investigated which factors lead pilots to submit voluntary anomaly reports about their flight performance. Diffusion Maps (DM), which have seen successful use in other domains such as image classification and pattern recognition, were selected to perform dimensionality reduction on the text records. High-dimensional data in the form of narrative text reports from the NASA Aviation Safety Reporting System (ASRS) were clustered and categorized by way of dimensionality reduction. Supervised analyses were performed to create a baseline document clustering system. Dimensionality reduction techniques identified concepts or keywords within records and allowed the creation of a framework for an unsupervised document classification system. The unsupervised clustering algorithm performed similarly to the supervised methods outlined in the study. The dimensionality reduction was performed on 100 of the most commonly occurring words within 126,000 text records describing commercial aviation incidents. This study demonstrates that unsupervised machine clustering and organization of incident reports is possible based on unbiased inputs. The findings reinforced traditional views on which factors contribute to civil aviation anomalies; however, new associations between previously unrelated factors and conditions were also found.
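
To make the dimensionality reduction step concrete, the following is a minimal sketch of a basic diffusion map applied to a small word-count matrix. It is not the study's implementation: the Gaussian kernel, the median bandwidth heuristic, the function name `diffusion_map`, and the synthetic Poisson counts (standing in for counts of common words per report) are all assumptions made for illustration.

```python
import numpy as np


def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Embed the rows of X with a basic diffusion map (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between rows (documents).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if epsilon is None:
        # Assumption: median of the nonzero squared distances as bandwidth.
        epsilon = np.median(d2[d2 > 0])
    # Gaussian affinity kernel (an assumed, commonly used choice).
    K = np.exp(-d2 / epsilon)
    d = K.sum(axis=1)
    # S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues
    # as the row-stochastic Markov matrix P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]  # sort eigenpairs in descending order
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of P are recovered as D^{-1/2} v.
    psi = vecs / np.sqrt(d)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the
    # remaining coordinates by eigenvalue**t (t = diffusion time).
    idx = slice(1, n_components + 1)
    return psi[:, idx] * vals[idx] ** t, vals


# Hypothetical toy data: 20 "reports" over an 8-word vocabulary, drawn from
# two groups that favour different words.
rng = np.random.default_rng(0)
rates_a = np.array([10, 10, 10, 10, 0.1, 0.1, 0.1, 0.1])
rates_b = rates_a[::-1]
counts = np.vstack([
    rng.poisson(rates_a, size=(10, 8)),
    rng.poisson(rates_b, size=(10, 8)),
])

coords, eigenvalues = diffusion_map(counts, n_components=2)
print(coords[:, 0])  # first diffusion coordinate for each toy report
```

On data like this, the first diffusion coordinate is expected to take opposite signs for the two groups, which is the kind of low-dimensional structure a clustering step can then use. For a corpus the size of the one described above, a dense kernel matrix would not be practical, and a sparse nearest-neighbour kernel would be one possible substitute.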