Querying graphs with uncertain predicates

In many applications the available data give rise to an attributed graph, in which nodes correspond to the entities of interest, edges to their relationships, and attributes on both provide additional characteristics. To mine such data structures we have proposed a visual analytic algebra that augments the atomic operators of selection and aggregation with a visualization step that allows the user to interact with the data. However, in many settings the user is uncertain about the desired query, and the problem is compounded when the final results are the product of a series of such uncertain queries. To address this issue, we introduce a probabilistic framework that incorporates uncertainty in the queries and assesses the likelihood of the obtained outcomes. We discuss its technical characteristics and illustrate it with a number of examples.
