On Uncertain Graphs Modeling and Queries

Large-scale, highly interconnected networks pervade both our society and the natural world. Uncertainty is inherent in the underlying data for a variety of reasons, such as noisy measurements, imprecise information needs, inference and prediction models, or explicit manipulation, e.g., for privacy purposes. Uncertain, or probabilistic, graphs are therefore increasingly used to represent noisy linked data in many emerging application scenarios, and they have recently become a hot topic in the database research community. Many classical graph problems, such as reachability and shortest path queries, become #P-complete, and hence more expensive, in uncertain graphs; at the same time, various complex queries are emerging over uncertain networks, such as pattern matching, information diffusion, and influence maximization queries. In this tutorial, we discuss the sources of uncertain graphs and their applications, uncertainty modeling, and the complexities and algorithmic advances in uncertain graph processing for both classical and emerging graph queries. We emphasize the current challenges and highlight some future research directions.
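To make the cost of reachability on an uncertain graph concrete, the following is a minimal sketch and not a method taken from the tutorial itself. It assumes the common possible-world model, in which each directed edge exists independently with a given probability. The function names (`exact_reachability`, `estimate_reachability`) and the toy graph are hypothetical, chosen only for illustration. The exact computation enumerates all 2^m possible worlds, which is feasible only for tiny graphs; the Monte Carlo estimator samples worlds instead.

```python
import random
from collections import deque
from itertools import product


def _reachable(adjacency, source, target):
    """Breadth-first search on one deterministic possible world."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def exact_reachability(edges, source, target):
    """Sum the probabilities of all possible worlds in which target is reachable.

    `edges` maps a directed edge (u, v) to its existence probability.
    Edge independence is an assumption of this sketch.
    """
    items = list(edges.items())
    total = 0.0
    for mask in product([False, True], repeat=len(items)):
        world_prob = 1.0
        adjacency = {}
        for present, ((u, v), p) in zip(mask, items):
            world_prob *= p if present else 1.0 - p
            if present:
                adjacency.setdefault(u, []).append(v)
        if _reachable(adjacency, source, target):
            total += world_prob
    return total


def estimate_reachability(edges, source, target, num_samples=20000, seed=0):
    """Monte Carlo estimate: sample worlds and count those where target is reachable."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        adjacency = {}
        for (u, v), p in edges.items():
            if rng.random() < p:
                adjacency.setdefault(u, []).append(v)
        if _reachable(adjacency, source, target):
            hits += 1
    return hits / num_samples


# Hypothetical toy graph: two routes from "a" to "c".
toy_edges = {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.5}
exact = exact_reachability(toy_edges, "a", "c")
approx = estimate_reachability(toy_edges, "a", "c")
```

On this toy graph the exact value is 1 - (1 - 0.5)(1 - 0.25) = 0.625, and the sampled estimate should land close to it. The enumeration loop doubles in cost with every added edge, which is the practical face of the #P-completeness mentioned above and the reason sampling-based estimators are a natural fallback.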
