Data mining in an engineering design environment: OR applications from graph matching

Data mining has been making inroads into the engineering design environment, an area that generates large amounts of heterogeneous data for which suitable mining methods are not readily available. For instance, an unsupervised data mining task such as clustering requires an accurate measure of distance or similarity. This paper focuses on the development of an accurate similarity measure for bills of materials (BOMs) that can be used to cluster BOMs into product families and subfamilies. The paper presents a new problem, tree bundle matching (TBM), identified in the course of the research; gives a non-polynomial formulation; proves that the problem is NP-hard; and suggests possible heuristic approaches.

In the typical life cycle of an engineering project or product, enormous amounts of diverse engineering data are generated. These include BOMs, product design models in CAD, engineering drawings, manufacturing process plans, quality and test data, and warranty records. Such data contain information crucial for the efficient and timely development of new products and variants; however, this information is often not available to designers. Our research employs data mining methods to extract this design information and improve its accessibility to design engineers. This paper focuses on one aspect of the overall research agenda: clustering BOMs into families and subfamilies. It extends previous work on a graph-based similarity measure for BOMs (a class of unordered trees) by presenting the new TBM problem and proving it to be NP-hard. The overall contribution of this work is to demonstrate how OR techniques from graph matching, stochastic methods, optimization, and other areas apply to data mining in the engineering design environment.
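To make the idea of a distance between BOMs viewed as unordered trees concrete, the following is a minimal sketch. It is not the similarity measure developed in the paper, and it does not address TBM; it is a deliberately crude stand-in that compares the multisets of parent-child label pairs of two trees, so that sibling order does not matter. The tree encoding `(label, [children])`, the function names `edge_bag` and `bom_distance`, and the example BOMs are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical encoding: a BOM node is (label, [child nodes]).
# Sibling order carries no meaning, matching the unordered-tree view.

def edge_bag(tree):
    """Return the multiset of (parent label, child label) pairs in a tree."""
    label, children = tree
    bag = Counter()
    for child in children:
        bag[(label, child[0])] += 1
        bag += edge_bag(child)
    return bag

def bom_distance(a, b):
    """Jaccard-style distance on edge multisets (illustrative only).

    0.0 means identical edge multisets; 1.0 means no shared edges.
    This ignores deeper structure and is NOT the paper's measure.
    """
    ea, eb = edge_bag(a), edge_bag(b)
    union = sum((ea | eb).values())   # multiset union: max of counts
    if union == 0:
        return 0.0                    # two single-node trees
    inter = sum((ea & eb).values())   # multiset intersection: min of counts
    return 1.0 - inter / union

wheel = ("wheel", [("rim", []), ("tire", [])])
wheel_reordered = ("wheel", [("tire", []), ("rim", [])])
wheel_variant = ("wheel", [("rim", []), ("tube", [])])

bike_a = ("bike", [("frame", []), wheel, wheel])
bike_b = ("bike", [wheel_reordered, ("frame", []), wheel])  # same BOM, reordered
bike_c = ("bike", [("frame", []), wheel, wheel_variant])    # one part differs

print(bom_distance(bike_a, bike_b))  # 0.0
print(bom_distance(bike_a, bike_c))  # 0.25
```

A pairwise distance matrix built from a function of this kind could be passed to any standard clustering routine to group BOMs into families. Because the edge-multiset comparison ignores how subtrees nest, two structurally different BOMs can receive a distance of zero, which is one reason a more careful tree-matching measure is needed.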
