Outlier detection using PCA mix based T2 control chart for continuous and categorical data

Abstract The presence of outliers may lead to misdetection of out-of-control observations in Phase II; therefore, they should be removed in Phase I. This paper proposes a PCA Mix-based T2 chart with a kernel density control limit for mixed continuous and categorical data. Simulation studies are conducted to evaluate the performance of the proposed chart in detecting outliers in clean and contaminated data. The proposed chart performs better than the benchmark in monitoring clean data. For contaminated data, the proposed chart performs best when the categorical data are generated from a multinomial distribution with balanced parameters. This is confirmed with simulated and real datasets. Compared with the conventional chart and other robust charts, the proposed chart performs well, correctly detecting more outliers at higher percentages of added outliers.
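
The abstract does not give the chart's formulas, so the following is only a rough sketch of how a T2 statistic on mixed-data principal components could be paired with a kernel density control limit. Everything in it is an assumption made for illustration: the function names, the simplified PCAmix-style weighting of the indicator columns, the Gaussian kernel, the grid-based quantile, and the default false-alarm rate `alpha`. It is not the authors' implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde


def pcamix_like_scores(X_num, X_cat, n_components):
    """Component scores from a simplified PCAmix-style decomposition (assumption)."""
    n = X_num.shape[0]
    # Standardize the continuous columns.
    Z_num = (X_num - X_num.mean(axis=0)) / X_num.std(axis=0)
    # One-hot encode each categorical column, centre it, and scale each
    # indicator by 1/sqrt(level proportion), as in multiple correspondence analysis.
    blocks = []
    for j in range(X_cat.shape[1]):
        levels = np.unique(X_cat[:, j])
        G = (X_cat[:, [j]] == levels[None, :]).astype(float)
        p = G.mean(axis=0)
        blocks.append((G - p) / np.sqrt(p))
    Z = np.hstack([Z_num] + blocks)
    # SVD of the combined matrix gives the principal components.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    eigvals = s[:n_components] ** 2 / n
    return scores, eigvals


def t2_statistic(scores, eigvals):
    """Hotelling-type T2: squared scores scaled by component variances."""
    return np.sum(scores ** 2 / eigvals, axis=1)


def kde_control_limit(t2, alpha=0.01, grid_size=2000):
    """Upper control limit as the (1 - alpha) quantile of a Gaussian KDE of T2."""
    kde = gaussian_kde(t2)
    grid = np.linspace(0.0, t2.max() * 1.5, grid_size)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]  # normalize the numerical CDF over the grid
    return grid[np.searchsorted(cdf, 1.0 - alpha)]
```

In a Phase I setting, observations whose T2 value exceeds the limit would be flagged as candidate outliers. Whether they are then removed and the limit re-estimated is a design choice the abstract does not specify.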
