A SOM mapping technique for visualizing documents in a database

A method is introduced for mapping documents onto a two-dimensional map, based on their citations, for clustering and visualization in technology forecasting applications. The citation data are used to build an adjacency matrix that describes the document set as an undirected graph. The dimensionality of the adjacency matrix is reduced using principal component analysis (PCA). The reduced-dimension data are used to train a small rectangular self-organizing map (SOM). After training, each document's input vector is premultiplied by the SOM weight matrix to obtain a spatial response across the SOM, and the centroid of this response gives the document's position on the map. The ordination method is demonstrated on a synthetic data set with good results. Further encouraging results on a real data set of 118 polymer documents are also shown.
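
The pipeline described above can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names, the toy citation list, the grid size, the learning-rate and neighborhood schedules, and the number of retained components are all assumptions chosen for illustration. In particular, the abstract does not say how negative responses are handled when taking the centroid, so the sketch shifts each response to be non-negative before averaging, which is one possible choice.

```python
import numpy as np

def adjacency_from_citations(n_docs, citations):
    """Symmetric 0/1 adjacency matrix from (citing, cited) index pairs."""
    A = np.zeros((n_docs, n_docs))
    for i, j in citations:
        A[i, j] = A[j, i] = 1.0  # undirected graph: ignore citation direction
    return A

def pca_reduce(A, k):
    """Project the rows of A onto the first k principal components."""
    X = A - A.mean(axis=0)                      # center each column
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T

def train_som(X, rows, cols, epochs=200, seed=0):
    """Train a small rectangular SOM; returns (weights, grid coordinates)."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = rng.normal(scale=0.1, size=(rows * cols, X.shape[1]))
    sigma0 = max(rows, cols) / 2.0
    for t in range(epochs):
        frac = t / epochs
        lr = 0.5 * (1.0 - frac)                 # assumed linear decay
        sigma = sigma0 * (1.0 - frac) + 0.5     # assumed shrinking neighborhood
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian neighborhood
            W += lr * h[:, None] * (x - W)
    return W, grid

def map_documents(X, W, grid):
    """Place each document at the centroid of its response across the SOM."""
    R = X @ W.T                                  # row d is W @ x_d, one value per node
    # Assumption: shift each response to be non-negative so it can act as weights.
    R = R - R.min(axis=1, keepdims=True) + 1e-12
    R = R / R.sum(axis=1, keepdims=True)
    return R @ grid                              # weighted mean of node coordinates

# Toy example (hypothetical data): two groups of four documents, one cross link.
citations = [(0, 1), (0, 2), (1, 2), (2, 3),
             (4, 5), (4, 6), (5, 6), (6, 7), (3, 4)]
A = adjacency_from_citations(8, citations)
X = pca_reduce(A, k=3)
W, grid = train_som(X, rows=3, cols=4)
coords = map_documents(X, W, grid)
print(np.round(coords, 2))
```

Because each position is a weighted mean of node coordinates, documents land at continuous locations inside the grid rather than being snapped to a single best-matching node, which is what allows a small SOM to separate many documents.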