The Effects of Scaling on Neural Network Classification

Artificial neural networks (ANNs) are often treated as black boxes for classification tasks. When the ANN is regarded as a simple tool that performs the final classification, research effort tends to concentrate on preprocessing stages that improve the quality of the network's input. One such preprocessor is the MSECT algorithm of Zahorian and Jagharghi [1], which improves vowel classification. Since MSECT applies an affine transformation to the data, it is not obvious why it should make any difference to the end result. By implementing and testing the MSECT algorithm, using a simple backpropagation neural network as a standard for measuring the amount of training needed to correctly classify two data clusters, we confirmed the results of Zahorian and Jagharghi [2].

The simple ANN we used to classify the vowel data turned out not to be so simple after all. The preprocessing algorithm changes not only the dimension but also the scale of the vowel data, so the ANN would need different parameter settings to perform optimally on the original data than on the preprocessed data. Because the parameters of the ANN were not modified, the preprocessing could therefore yield better results for one of the data sets. We ran an experiment with different scalings of the same data sets. For the ANN parameters we used, the best results in terms of speed of convergence and accuracy were obtained when the data were scaled to an input range between 5 and 18.
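To make the scaling experiment concrete, the sketch below rescales the same two-cluster data to different input ranges and counts the backpropagation epochs needed for the error to drop below a threshold. It is a minimal illustration, not a reproduction of MSECT or of our experimental setup: the network size, learning rate, cluster data, and the helper names rescale and train_epochs are all illustrative assumptions, with only the target range [5, 18] taken from the text above.

```python
# Minimal sketch: effect of an affine input rescaling on backpropagation
# training time. All parameters here are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def rescale(X, lo, hi):
    """Affine map of each feature into [lo, hi] (hypothetical helper)."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return lo + (X - mn) * (hi - lo) / (mx - mn)

def sigmoid(z):
    # Clip to avoid overflow warnings when inputs saturate the units.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def train_epochs(X, y, hidden=4, lr=0.1, tol=0.02, max_epochs=5000):
    """Train a one-hidden-layer sigmoid net with plain batch backprop;
    return the number of epochs until the mean squared error < tol."""
    n, d = X.shape
    W1 = rng.normal(scale=0.5, size=(d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
    for epoch in range(1, max_epochs + 1):
        h = sigmoid(X @ W1 + b1)      # hidden activations
        out = sigmoid(h @ W2 + b2)    # network output in (0, 1)
        err = out - y
        if np.mean(err**2) < tol:
            return epoch
        # Backpropagate the squared-error gradient through both layers.
        d_out = err * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out / n;  b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / n;    b1 -= lr * d_h.mean(axis=0)
    return max_epochs  # did not converge within the budget

# Two Gaussian clusters stand in for the two data clusters in the text.
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))])
y = np.vstack([np.zeros((50, 1)), np.ones((50, 1))])

# Same data, same network parameters, different input ranges.
for lo, hi in [(0, 1), (5, 18), (-100, 100)]:
    epochs = train_epochs(rescale(X, lo, hi), y)
    print(f"input range [{lo}, {hi}]: {epochs} epochs to converge")
```

The exact epoch counts depend on the weight initialization and learning rate; the point of the sketch is only that, with the network parameters held fixed, the choice of input range alone changes how quickly (or whether) training converges.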