Efficient Data Projection for Visual Analysis of Large Data Sets Using Neural Networks

Classical visualization methods, including multidimensional scaling and its particular case, Sammon's mapping, encounter difficulties when analyzing large data sets. One possible way to solve this problem is to apply artificial neural networks. This paper presents the visualization of large data sets using the feed-forward neural network SAMANN. Its back-propagation-like learning rule was developed to allow a feed-forward artificial neural network to learn Sammon's mapping in an unsupervised way. In its initial form, SAMANN training is computationally expensive. In this paper, we identify conditions that optimize the computational expenditure of visualization, even for large data sets. We show that the original dimensionality of the data can be reduced to a lower one in a small number of iterations. Visualization results for real-world data sets are presented.
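For orientation, the quantity that Sammon's mapping minimizes, and that SAMANN is trained to reduce, is Sammon's stress, which compares pairwise distances in the original space with those in the projection. The following is a minimal sketch of that error measure only, not of the SAMANN training rule or of the optimizations studied in the paper. The function names `euclidean` and `sammon_stress` are hypothetical, and Euclidean distances between distinct points are assumed.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_stress(high, low):
    """Sammon's stress between original points and their projections.

    high: list of points in the original space
    low:  list of the corresponding projected points (same order)
    Assumes no two original points coincide, since the stress divides
    by the original pairwise distances.
    """
    if len(high) != len(low):
        raise ValueError("high and low must contain the same number of points")
    pairs = list(combinations(range(len(high)), 2))
    d_high = [euclidean(high[i], high[j]) for i, j in pairs]
    d_low = [euclidean(low[i], low[j]) for i, j in pairs]
    if any(d == 0 for d in d_high):
        raise ValueError("coincident original points make the stress undefined")
    # Each squared distance error is weighted by 1 / original distance,
    # then the sum is normalized by the total of the original distances.
    weighted = sum((dh - dl) ** 2 / dh for dh, dl in zip(d_high, d_low))
    return weighted / sum(d_high)
```

A projection that preserves every pairwise distance has stress 0, and larger values indicate greater distortion. Because the measure runs over all pairs of points, its cost grows quadratically with the number of points, which is one reason large data sets are hard for methods of this kind.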
