A neural network model of a contact plasma etch process for VLSI production

The etch process for preparation of via contacts in VLSI manufacturing is described along with a neural network model of the process. The neural network is a two-hidden-layer network (23-3-3-1) trained by error back-propagation. The input variables to the model are the mean values of set-point fluctuations for the control variables of the plasma reactor, and the output is the oxide thickness remaining after the etch. The model is thus removed from the physical process by several levels of abstraction. The real-world process results in a film thickness of about 24 000 Å with a standard deviation of about 730 Å. We demonstrate that a neural network model can predict the post-etch oxide thickness to within 480 Å and that the inherent noise in the training/testing data is 416 Å. We also demonstrate that the dc bias and the etch times are the most important variables in determining the final product quality.
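As a rough illustration of the 23-3-3-1 architecture and of error back-propagation, the following NumPy sketch builds a network of that shape and performs gradient-descent updates on mean squared error. Only the layer sizes come from the abstract. The tanh hidden units, the linear output unit, the weight initialization, the learning rate, and all function names are assumptions made for this example, not details of the authors' implementation.

```python
import numpy as np

# Layer sizes from the abstract: 23 inputs, two hidden layers of 3 units, 1 output.
LAYER_SIZES = [23, 3, 3, 1]


def init_params(rng, sizes=LAYER_SIZES):
    """Random weights and zero biases per layer (initialization scale is an assumption)."""
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for i, (w, b) in enumerate(params):
        z = activations[-1] @ w + b
        is_output = i == len(params) - 1
        # Assumed: tanh hidden units and a linear output unit.
        activations.append(z if is_output else np.tanh(z))
    return activations


def backprop_step(params, x, y, lr=0.01):
    """One back-propagation update on mean squared error.

    Returns the updated parameters and the loss measured before the update.
    """
    acts = forward(params, x)
    err = acts[-1] - y
    loss = float(np.mean(err ** 2))
    delta = 2.0 * err / len(x)  # gradient of the loss w.r.t. the network output
    new_params = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate the error through the tanh units of the layer below.
            delta = (delta @ w.T) * (1.0 - acts[i] ** 2)
        new_params.append((w - lr * grad_w, b - lr * grad_b))
    new_params.reverse()
    return new_params, loss
```

In use, `x` would be an array of shape `(n_samples, 23)` holding the (suitably scaled) mean set-point fluctuations and `y` an array of shape `(n_samples, 1)` holding the scaled post-etch oxide thickness; `backprop_step` is then called repeatedly until the loss stops improving.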
