Database Systems for Advanced Applications

A latent entity association (EA) is an indirect association between two entities that arises through multiple intermediate entities appearing in different textual Web contents (TWCs), such as e-mails, Web news, and social network pages. In this paper, we adopt the Bayesian network as the framework for representing and inferring latent EAs together with their association probabilities, and we propose the concept of the entity association Bayesian network (EABN). To construct the EABN efficiently, we use a self-organizing map to divide the TWC dataset, so that the co-occurrence-based dependence of each pair of entities involves only a small set of documents. Using probabilistic inference over the EABN, we evaluate and rank the EAs of all possible entity pairs, which allows novel latent EAs to be discovered. Experimental results show the effectiveness and efficiency of our approach.
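To make the idea of a latent association concrete, the following is a minimal sketch and not the construction or inference algorithm of the paper. It assumes a toy corpus in which each document is reduced to the set of entities it mentions, a hand-picked chain A -> B -> C standing in for a learned network, and conditional probabilities estimated directly from document co-occurrence counts. The names `docs`, `cond_prob`, and `latent_association` are hypothetical and introduced only for illustration.

```python
# Toy corpus (hypothetical data): each document is the set of entities it mentions.
# A and C never appear in the same document; B co-occurs with both.
docs = [
    {"A", "B"},
    {"A", "B"},
    {"B", "C"},
    {"B", "C"},
    {"A"},
    {"C"},
]


def cond_prob(target, given, docs, present=True):
    """Estimate P(target appears | given appears) from co-occurrence counts.

    With present=False, condition on `given` being absent instead.
    """
    pool = [d for d in docs if (given in d) == present]
    if not pool:
        return 0.0
    return sum(target in d for d in pool) / len(pool)


def latent_association(src, mid, dst, docs):
    """P(dst | src) on the assumed chain src -> mid -> dst.

    Marginalizes over whether the intermediate entity `mid` appears:
    P(dst | src) = P(dst | mid) P(mid | src) + P(dst | not mid) P(not mid | src).
    """
    p_mid = cond_prob(mid, src, docs)
    return (cond_prob(dst, mid, docs, present=True) * p_mid
            + cond_prob(dst, mid, docs, present=False) * (1.0 - p_mid))


direct = cond_prob("C", "A", docs)               # direct co-occurrence estimate
latent = latent_association("A", "B", "C", docs)  # estimate via intermediate B
print(direct, latent)
```

In this toy corpus the direct co-occurrence estimate for A and C is zero, while marginalizing over the intermediate entity B yields a nonzero association probability. Scoring every entity pair this way and sorting by the result would give a ranking in the spirit of the one described above, although a full EABN would involve a learned graph structure and general probabilistic inference rather than a fixed three-node chain.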