Similarity Measure and Structural Index of XML Documents
This paper presents a quantitative measure of the difference between two XML documents, called the XED distance. An XML document can be represented as a concise, weighted structural index tree. It is proven that the similarity between two XML documents can be measured by the distance between their structural index trees. Because the structural index tree is dramatically smaller than the original document tree, it greatly reduces the cost of measuring the similarity between two XML documents. The approach can be used in many applications, including approximate searching of XML documents, clustering of XML documents, structure extraction from XML documents, change detection in XML documents, and incremental maintenance of XML views.
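
To make the idea of a weighted structural index tree concrete, the following is a minimal Python sketch. It is not the paper's XED definition, which the abstract does not give. It assumes a simple index in which elements sharing the same tag path are merged into one node whose weight counts the merged elements, and a toy distance that sums absolute weight differences over the two index trees. The names `IndexNode`, `build_index`, and `toy_distance` are hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class IndexNode:
    """One node of the (assumed) structural index: a tag plus an occurrence weight."""
    tag: str
    weight: int = 1
    children: Dict[str, "IndexNode"] = field(default_factory=dict)


def _merge_into(children: Dict[str, IndexNode], sub: IndexNode) -> None:
    """Merge `sub` into `children`, summing weights of nodes with the same tag path."""
    existing = children.get(sub.tag)
    if existing is None:
        children[sub.tag] = sub
        return
    existing.weight += sub.weight
    for grandchild in sub.children.values():
        _merge_into(existing.children, grandchild)


def build_index(elem: ET.Element) -> IndexNode:
    """Collapse an element tree into a weighted index tree (assumed construction)."""
    node = IndexNode(elem.tag)
    for child in elem:
        _merge_into(node.children, build_index(child))
    return node


def element_count(elem: ET.Element) -> int:
    """Number of elements in the original document tree."""
    return 1 + sum(element_count(c) for c in elem)


def index_node_count(node: IndexNode) -> int:
    """Number of nodes in the index tree."""
    return 1 + sum(index_node_count(c) for c in node.children.values())


def _total_weight(node: IndexNode) -> int:
    return node.weight + sum(_total_weight(c) for c in node.children.values())


def _dist(a: Optional[IndexNode], b: Optional[IndexNode]) -> int:
    """Sum of absolute weight differences; a missing subtree costs its total weight."""
    if a is None:
        return _total_weight(b)
    if b is None:
        return _total_weight(a)
    d = abs(a.weight - b.weight)
    for tag in set(a.children) | set(b.children):
        d += _dist(a.children.get(tag), b.children.get(tag))
    return d


def toy_distance(a: IndexNode, b: IndexNode) -> int:
    """Toy distance between two index trees (an illustration, not the XED distance)."""
    if a.tag != b.tag:
        return _total_weight(a) + _total_weight(b)
    return _dist(a, b)


DOC_A = "<lib><book><title/><author/></book><book><title/></book></lib>"
DOC_B = "<lib><book><title/></book></lib>"
```

With these two sample documents, `DOC_A` has 6 elements but its index tree has only 4 nodes, and `toy_distance(build_index(ET.fromstring(DOC_A)), build_index(ET.fromstring(DOC_B)))` evaluates to 3. The comparison touches only index nodes, which is the source of the cost saving the abstract describes.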