XML Node Semantic Weight Model Based on VSM

XML has become an important format for data exchange. The ranking of XML search results directly affects XML information retrieval performance. Most existing ranking models consider the statistical characteristics of words in an XML document, but they do not consider the position of the node to which a word belongs. That is, all nodes in an XML document are treated as equally important. However, different nodes play different roles in the document as a whole, so the same content appearing in different nodes should carry different weights. In other words, different nodes should have different node semantic weights. In this paper, we present a VSM-based method for XML node semantic weight (XNSW-VSM), in which the weight of a node is scaled by the similarity between the node and the whole document. Experimental data were selected from Wiki data sets. The Pearson correlation coefficient between the semantic results given by experts and the results of the model is 0.827. This shows that the node semantic weight model can capture the importance of a node in an XML document and can help improve ranking results.
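
The central idea, weighting a node by its similarity to the whole document in a vector space model, can be illustrated with a minimal sketch. The abstract does not specify the term weighting scheme, so the sketch assumes raw term-frequency vectors and cosine similarity; the tokenizer, the function names (term_vector, cosine, node_weights), the restriction to direct children of the root, and the sample document are all hypothetical choices made for illustration and are not taken from the XNSW-VSM model itself.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed weighting: raw counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def node_weights(xml_string):
    """Map each direct child tag of the root to its similarity with the document.

    Simplification: only direct children are scored, and children that share
    a tag name overwrite one another in the returned dictionary.
    """
    root = ET.fromstring(xml_string)
    doc_vec = term_vector(" ".join(root.itertext()))
    return {
        child.tag: cosine(term_vector(" ".join(child.itertext())), doc_vec)
        for child in root
    }


# Hypothetical sample document, not drawn from the Wiki data sets.
SAMPLE = """<article>
  <title>xml retrieval ranking</title>
  <body>xml retrieval ranking uses node weights for xml retrieval</body>
  <note>misc</note>
</article>"""

weights = node_weights(SAMPLE)
for tag, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{tag}: {weight:.3f}")
```

Under these assumptions, a node whose vocabulary overlaps heavily with the document as a whole (body, title) receives a higher weight than a node with little overlap (note), which is the kind of distinction between nodes that the abstract describes.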