Exploring due date reliability in production systems using data mining methods adapted from gene expression analysis

Abstract Identifying the causes of lateness in multistage production systems requires methods that can handle a high-dimensional space of order and process attributes. Gene expression is the production process of functional elements (enzymes, proteins) in a biological cell. Simultaneously measuring the expression levels of thousands of genes in a cell yields a data set for understanding robust cellular function, and methods developed in computational systems biology for analyzing such data enable the identification of the most influential criteria sets. Logistics data analysis faces a similar challenge: which attributes of orders are associated with high and low punctuality? We combine methods from cluster analysis and computational systems biology to explore how order and resource parameters relate to lateness. With this novel approach, we determine intrinsic interdependencies between order parameters and process parameters. In the case study described here, the approach improved the precision of predicting an order's lateness by 14% compared with a majority vote among neighboring orders in parameter space.
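
The baseline named in the abstract, a majority vote among neighboring orders in parameter space, can be read as a k-nearest-neighbor classifier. The following is a minimal Python sketch under that reading, not the implementation used in the case study. The function name `majority_vote_lateness`, the Euclidean distance, the value of `k`, the binary late/on-time labels, and the toy order attributes are all assumptions made for illustration.

```python
import math
from collections import Counter


def majority_vote_lateness(orders, late_flags, query, k=3):
    """Predict lateness of `query` by majority vote among its k nearest orders.

    orders     -- list of numeric attribute tuples (hypothetical order parameters)
    late_flags -- list of booleans, True if the corresponding order was late
    query      -- attribute tuple of the order to classify
    k          -- number of neighbors that vote (assumed; use an odd value
                  with binary labels to avoid ties)
    """
    if not orders or len(orders) != len(late_flags):
        raise ValueError("orders and late_flags must be non-empty and equally long")

    # Rank known orders by Euclidean distance to the query (assumed metric).
    ranked = sorted(range(len(orders)), key=lambda i: math.dist(orders[i], query))

    # Let the k closest orders vote on the label.
    votes = Counter(late_flags[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: (processing time, number of operations), purely illustrative.
orders = [(1.0, 2.0), (1.2, 1.8), (0.9, 2.1), (5.0, 6.0), (5.2, 5.9), (4.8, 6.3)]
late = [False, False, False, True, True, True]

print(majority_vote_lateness(orders, late, (5.1, 6.1), k=3))
print(majority_vote_lateness(orders, late, (1.1, 2.0), k=3))
```

In practice the attributes would need to be scaled to comparable ranges before computing distances, since otherwise a single parameter with large numeric values dominates the neighborhood.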