Automatic feature selection for unsupervised clustering of cycle-based signals in manufacturing processes

Recent developments in sensing and computer technology have made most manufacturing processes data-rich environments. A cycle-based signal is an analog or digital signal obtained during each repetition of an operation cycle in a manufacturing process. Cycle-based signals are an important class of in-process sensing signals because they carry extensive information on process condition and product quality (e.g., the forming force signal in forging processes). Existing supervised classification approaches depend heavily on a training dataset or engineering field knowledge. In contrast, this paper develops an automatic feature selection method for the unsupervised clustering of cycle-based signals. First, principal component analysis is applied to the raw signals. Then a new method is proposed to select the principal components that contain clustering information. Together, these two steps can significantly reduce the dimension of the problem. Finally, a model-based clustering method is applied to the selected principal components to find the clusters in the cycle-based signals. A numerical example and a real-world example from a forging process illustrate the effectiveness of the proposed method. The proposed technique is an important data pre-processing step in developing monitoring and diagnostic systems that use cycle-based signals in manufacturing processes.
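The three steps described above (principal component analysis, selection of informative components, model-based clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not give the selection criterion, so the sketch substitutes a simple stand-in that ranks components by how far their excess kurtosis departs from the Gaussian value of zero. The synthetic signals, the function names (pca_scores, select_components, fit_gmm_diag), the two-cluster setting, and the diagonal-covariance Gaussian mixture are likewise hypothetical choices made only for the example.

```python
import numpy as np

def pca_scores(X):
    """PCA via SVD; rows of X are cycles, columns are samples within a cycle."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U * s, s**2 / (X.shape[0] - 1)  # scores, component variances

def select_components(scores, n_keep=1, n_candidates=10):
    """Stand-in rule (an assumption, NOT the paper's criterion): keep the
    leading components whose excess kurtosis is farthest from 0."""
    cand = scores[:, :n_candidates]
    z = (cand - cand.mean(axis=0)) / cand.std(axis=0)
    excess_kurtosis = (z**4).mean(axis=0) - 3.0
    order = np.argsort(-np.abs(excess_kurtosis))
    return np.sort(order[:n_keep])

def fit_gmm_diag(Z, n_iter=100):
    """EM for a two-component Gaussian mixture with diagonal covariances."""
    n = Z.shape[0]
    # Deterministic start: the cycles at the extremes of the first column.
    means = Z[[Z[:, 0].argmin(), Z[:, 0].argmax()]].copy()
    variances = np.tile(Z.var(axis=0), (2, 1))
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities from per-component log densities.
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum((Z[:, None, :] - means) ** 2 / variances, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances (with a small floor).
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ Z) / nk[:, None]
        variances = (resp.T @ Z**2) / nk[:, None] - means**2 + 1e-6
    return resp.argmax(axis=1)

# Synthetic cycle-based signals (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n, p = 300, 60
t = np.linspace(0.0, 1.0, p)
truth = np.repeat([0, 1], n // 2)
amplitude = rng.normal(1.0, 0.3, size=n)        # nuisance variation in both clusters
bump = np.exp(-0.5 * ((t - 0.5) / 0.05) ** 2)   # local change that defines cluster 1
X = (amplitude[:, None] * np.sin(2 * np.pi * t)
     + truth[:, None] * bump
     + rng.normal(0.0, 0.2, size=(n, p)))

scores, variances = pca_scores(X)
selected = select_components(scores, n_keep=1)
labels = fit_gmm_diag(scores[:, selected])

# Cluster labels are arbitrary, so compare with the truth up to relabeling.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print("selected component indices:", selected.tolist())
print("agreement with true clusters:", agreement)
```

In this synthetic setup the largest-variance direction is the amplitude fluctuation shared by all cycles, which carries no cluster information. That is the situation in which choosing components by variance alone would be misleading and a dedicated selection step becomes useful. The kurtosis rule is crude (it also reacts to outliers) and is only a placeholder for whatever criterion a real application adopts.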
