Probabilistic local reconstruction for k-NN regression and its application to virtual metrology in semiconductor manufacturing

Locally linear reconstruction (LLR) provides a principled and k-insensitive way to determine the weights in k-nearest neighbor (k-NN) learning. LLR, however, does not provide a confidence interval for the reconstruction of a query point from its k neighbors, which is required in many real application domains. Moreover, its fixed linear structure makes the local reconstruction model unstable, so regression performance fluctuates across different values of k. We therefore propose probabilistic local reconstruction (PLR) as an extension of LLR for k-NN regression. First, we capture the reconstruction uncertainty probabilistically by incorporating a Gaussian regularization prior into the reconstruction model. This prevents over-fitting when there are no informative neighbors in the local reconstruction. We then project the data into a higher-dimensional feature space to capture the non-linear relationship between the neighbors and a query point when k is large. Preliminary experimental results demonstrate that the proposed Bayesian kernel treatment improves accuracy and k-invariance. Moreover, an experiment on a real virtual metrology data set from semiconductor manufacturing shows that the uncertainty information that PLR provides on its predictions supports more appropriate decision making.
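
The two ingredients named in the abstract, a Gaussian prior on the reconstruction weights and a kernel mapping, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact model, so the function name `plr_predict`, the hyperparameters `alpha` (prior precision), `beta` (noise precision), and `gamma` (RBF width), and in particular the variance formula are assumptions chosen to resemble standard Bayesian linear regression on the neighbors.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def plr_predict(X_train, y_train, x_query, k=5, alpha=1e-2, beta=1.0, gamma=1.0):
    """Hypothetical PLR-style prediction (a sketch, not the paper's formulas).

    Returns (mean, variance, weights) for a single query point.
    """
    # 1. Select the k nearest neighbours by Euclidean distance.
    d2 = ((X_train - x_query) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]
    k = len(idx)  # guards against k > number of training points
    Xk, yk = X_train[idx], y_train[idx]

    # 2. Kernel quantities: Gram matrix of the neighbours and their
    #    similarity to the query (inner products in feature space).
    G = rbf_kernel(Xk, Xk, gamma)
    c = rbf_kernel(Xk, x_query[None, :], gamma).ravel()

    # 3. Gaussian posterior over reconstruction weights w, assuming a
    #    prior w ~ N(0, I / alpha) and Gaussian reconstruction noise
    #    with precision beta.
    S = np.linalg.inv(alpha * np.eye(k) + beta * G)  # posterior covariance
    m = beta * S @ c                                 # posterior mean weights

    # 4. Prediction as the weighted sum of neighbour targets; the variance
    #    propagates weight uncertainty only (an assumption of this sketch).
    mean = m @ yk
    var = yk @ S @ yk
    return mean, var, m

# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X[:, 0] + 0.5 * X[:, 1]
mean, var, w = plr_predict(X, y, X[7], k=5, alpha=1e-8, gamma=2.0)
```

With a very small `alpha` the prior is nearly flat, so a query that coincides with a training point is reconstructed almost entirely from that point. Increasing `alpha` shrinks the weights toward zero, which is the regularizing effect the abstract attributes to the Gaussian prior.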
