Efficient Algorithms for Skyline Top-K Keyword Queries on XML Streams

Keyword queries are well suited to querying XML streams when no schema information is available. In current forms of keyword search on XML streams, however, ranking functions do not always reflect users' intentions. This paper approaches the problem from a different angle and presents skyline top-K keyword queries, a novel kind of keyword query on XML streams. For such queries, the skyline is used to select results from XML streams without having to weigh the complicated factors that influence relevance to the query. Building on skyline query processing techniques, algorithms are presented that process skyline top-K keyword queries on XML streams efficiently. Extensive experiments are performed to verify the effectiveness and efficiency of the presented algorithms.
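To make the notion of a skyline concrete, the following is a minimal sketch of the generic skyline operator, not the algorithms of this paper. It assumes that each candidate result is described by a tuple of numeric features in which smaller values are better; the two features in the example are hypothetical and are not taken from the paper. A result belongs to the skyline if no other result is at least as good in every feature and strictly better in at least one.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive nested-loop skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical feature tuples for five candidate results (illustrative only).
candidates = [(1, 4), (2, 2), (3, 3), (4, 1), (2, 5)]
result = skyline(candidates)
print(result)  # [(1, 4), (2, 2), (4, 1)]
```

Here (3, 3) is dropped because (2, 2) dominates it, and (2, 5) is dropped because (1, 4) dominates it. No weighting of the features is required, which is the property the abstract relies on when it replaces a ranking function with the skyline.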
