Sampling issues for classification using neural networks

Neural networks are information processing systems patterned after the highly interconnected neural structure of the human brain. They are trained from examples rather than explicitly programmed. The currently popular training algorithm, back propagation, suffers from serious drawbacks: lack of robustness, slow convergence (especially for large networks), and lack of reliability. Since neural network training is an optimization problem, this dissertation establishes the suitability of proven nonlinear optimization methods for the task and their superiority over back propagation.

Because of their ability to generalize and to tolerate noisy data, neural networks are particularly well suited to pattern recognition problems. One prominent pattern recognition problem is the classification problem, which assigns an observation, based on a set of attributes, to one of a finite number of groups. For example, a classifier may accept or reject a credit application based on an applicant's personal and financial data. This dissertation compares the performance, measured by the rate of correct classifications, of neural networks against that of the traditional multivariate discriminant analysis methods. The results show that even when the assumptions underlying the traditional methods held perfectly, neural networks compared favorably.

The question of how to design neural networks for classification is the next main topic of this dissertation. The issues investigated include sampling strategy and network architecture. Sampling strategy refers to decisions about sample size, sample composition, and the variance-covariance matrices of the attributes. Network architecture refers to the number of nodes in a network and the interconnections among them. A rigorous and extensive experiment was conducted to answer questions on these issues, and design principles were established from the empirical results.
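To make the training-as-optimization view concrete, the following is a minimal sketch (not the dissertation's experimental setup) of back propagation in its simplest one-layer form: a single logistic unit fitted to a made-up two-group problem by plain gradient descent. The data, learning rate, and epoch count are illustrative assumptions only.

```python
import math

def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Train a single logistic unit sigmoid(w.x + b) by gradient
    descent on the cross-entropy loss -- the one-layer special
    case of the back propagation algorithm."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid activation
            err = p - y                      # dLoss/dz for cross-entropy
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Assign x to group 1 if the unit's pre-activation is positive."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Hypothetical linearly separable data: two attributes, two groups.
samples = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.7)]
labels = [0, 0, 1, 1]
w, b = train_logistic(samples, labels)
print([predict(w, b, x) for x in samples])  # recovers the labels
```

The nonlinear optimization methods the dissertation advocates (e.g., quasi-Newton or conjugate-gradient methods) would replace the fixed-step update in the inner loop with a search direction and step size informed by curvature, which is the source of their faster, more reliable convergence.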
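The classification problem itself can be sketched with the simplest relative of the discriminant analysis methods used as the traditional benchmark: a nearest-group-mean rule, which reduces to linear discriminant analysis under equal, identity variance-covariance matrices. The credit-style data below are invented for illustration.

```python
def group_means(samples, labels):
    """Compute the mean attribute vector of each group."""
    sums, counts = {}, {}
    for x, g in zip(samples, labels):
        counts[g] = counts.get(g, 0) + 1
        sums[g] = [s + xi for s, xi in zip(sums.get(g, [0.0] * len(x)), x)]
    return {g: [s / counts[g] for s in sums[g]] for g in sums}

def classify(means, x):
    """Assign observation x to the group whose mean attribute
    vector is nearest in squared Euclidean distance."""
    def dist2(m):
        return sum((xi - mi) ** 2 for xi, mi in zip(x, m))
    return min(means, key=lambda g: dist2(means[g]))

# Hypothetical applicant attributes: (income score, debt score).
samples = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.9), (0.1, 0.8)]
labels = ["accept", "accept", "reject", "reject"]
means = group_means(samples, labels)
print(classify(means, (0.85, 0.15)))  # -> accept
```

A neural network classifier addresses the same assignment task but learns a decision boundary from the examples directly, without assuming a particular distributional form for the attributes.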