Skyline queries over possibilistic RDF data

Abstract: The volume and veracity of data on the Web are two main issues in information management. In this paper, we tackle both issues, with a particular interest in Resource Description Framework (RDF) data. To manage veracity, we rely on a powerful uncertainty theory, namely possibility theory, and propose a model for representing and managing possibilistic RDF data. To filter the massive amount of RDF data, we use the skyline operator to find a small set of resources that satisfy predefined user preferences. To this end, we propose a skyline operator that extracts the possibilistic RDF resources that are possibly not dominated by any other resource in the sense of Pareto dominance. We introduce a dominance operator and a skyline model adapted to this kind of data. In addition, we propose an efficient algorithm for computing the skyline. Experiments conducted on the skyline computation showed satisfactory results.
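
For readers unfamiliar with the skyline operator, the following is a minimal sketch of the classical (crisp) Pareto dominance test and a naive skyline computation. It is not the possibilistic operator or the algorithm proposed in the paper: it ignores possibility degrees entirely, and the names `dominates`, `skyline`, and the sample `hotels` data are hypothetical, chosen only for illustration. It assumes that smaller values are preferred on every dimension.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Pareto dominance (smaller is better): a is no worse than b on every
    dimension and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(resources: Dict[str, Point]) -> List[str]:
    """Naive quadratic skyline: keep the resources dominated by no other one."""
    return sorted(
        name
        for name, point in resources.items()
        if not any(
            dominates(other_point, point)
            for other_name, other_point in resources.items()
            if other_name != name
        )
    )


# Hypothetical data: (price, distance to the beach in km) per resource.
hotels = {
    "h1": (50.0, 3.0),
    "h2": (80.0, 1.0),
    "h3": (90.0, 2.5),
    "h4": (60.0, 3.5),
}

print(skyline(hotels))
```

In this sample, `h3` is dominated by `h2` and `h4` by `h1`, so only `h1` and `h2` remain. A possibilistic variant would additionally have to account for the uncertainty attached to each value, which this sketch does not attempt.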
