Trace Mining from Distributed Assembly Databases for Causal Analysis

Hierarchical component structures are common in industry, such as the components of a car. We focus on association mining over hierarchically assembled data items that are identified by labels such as lot numbers. Because product databases are massive and physically distributed, it is difficult to find associations involving deep-level items directly. We propose a top-down algorithm that uses virtual lot numbers to mine association rules from the hierarchical databases. Virtual lot numbers delegate the identity information of subcomponents to upper-level lot numbers without modifying the databases. Our pruning method reduces the number of enumerated items and avoids redundant database accesses. Experiments show that the algorithm runs an order of magnitude faster than a naive approach.
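The top-down idea can be illustrated with a minimal sketch. It is not the paper's algorithm, only one plausible reading of the abstract under stated assumptions: the dictionaries `PRODUCT_DB` and `ASSEMBLY_DB`, the set `FAULTY`, and the function `mine_lots` are all hypothetical names and toy data. Each dictionary stands in for a separate database. A sub-lot inherits the set of faulty products from its parent lot (a stand-in for a virtual lot number), so no product is traced individually. A lot is expanded only if its support reaches `min_support`. This pruning is exact only when every sub-lot has a single parent lot, as in the toy data; otherwise it is a heuristic. The sketch computes support counts only and does not derive rules.

```python
from collections import defaultdict

# Hypothetical toy data; each dict stands in for one physically separate database.
PRODUCT_DB = {"P1": ["E1"], "P2": ["E1"], "P3": ["E2"], "P4": ["E2"]}  # product -> top-level lots
ASSEMBLY_DB = {"E1": ["C1", "C2"], "E2": ["C3"], "C1": ["B1"]}         # lot -> sub-component lots
FAULTY = {"P1", "P2", "P3"}                                            # products with a reported fault


def mine_lots(product_db, assembly_db, faulty, min_support):
    """Return (frequent lots with support, number of lots whose sub-lots were looked up)."""
    # Level 0: which faulty products carry each top-level lot.
    frontier = defaultdict(set)
    for serial in faulty:
        for lot in product_db[serial]:
            frontier[lot].add(serial)

    frequent = {}
    lookups = 0
    while frontier:
        next_frontier = defaultdict(set)
        for lot, carriers in frontier.items():
            if len(carriers) < min_support:
                continue  # prune: sub-lots of an infrequent lot are never fetched
            frequent[lot] = len(carriers)
            lookups += 1  # one access to the database holding this lot's sub-lots
            for child in assembly_db.get(lot, []):
                # The child inherits the parent's carriers (assumed virtual-lot mapping).
                next_frontier[child] |= carriers
        frontier = next_frontier
    return frequent, lookups


frequent, lookups = mine_lots(PRODUCT_DB, ASSEMBLY_DB, FAULTY, min_support=2)
print(frequent, lookups)
```

In this toy run, lot `E2` appears in only one faulty product, so its sub-lot `C3` is never fetched. Only the branch under `E1` is expanded.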
