Effectiveness of a Data-based Influence Maximization Algorithm Using Information Diffusion Cascades
Influence maximization, a long-standing topic of considerable research interest, involves finding a small set of influential nodes in a social network. Building on the rich body of literature on influence maximization, a recent trend is the use of real data from information diffusion cascades. However, although data-based influence maximization is considered a promising approach, its evaluation remains limited. In this paper, the effectiveness of the data-based influence maximization algorithm DiffuGreedy is evaluated comprehensively using a Twitter dataset covering two years of retweets among approximately 0.3 million users. The key findings are as follows. 1) Compared with an existing model-based influence maximization algorithm applied to the same Twitter dataset, DiffuGreedy scores 20–30% higher in distinct nodes influenced, an index for evaluating the effectiveness of influence maximization algorithms. 2) When the training period is either too long (i.e., one year or longer) or too short (i.e., one day or shorter), DiffuGreedy is less effective than the existing model-based influence maximization algorithm. 3) There are short-term influencers who are not influential in the training period but who attract attention from other users in the test period, and finding these short-term influencers is key to improving the effectiveness of DiffuGreedy. However, the present results suggest that finding short-term influencers is a difficult task.
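To make the evaluation setup concrete, the following is a minimal sketch of how a data-based greedy seed selection and the distinct-nodes-influenced index could be computed from cascades. It is not the DiffuGreedy implementation evaluated in the paper. It assumes each cascade is recorded as an `(initiator, participants)` pair, and the function names `influence_sets`, `greedy_seeds`, and `distinct_nodes_influenced`, as well as the toy data, are hypothetical.

```python
from collections import defaultdict


def influence_sets(cascades):
    """Map each initiator to the distinct users who joined any of its cascades."""
    reached = defaultdict(set)
    for initiator, participants in cascades:
        reached[initiator].update(p for p in participants if p != initiator)
    return reached


def greedy_seeds(train_cascades, k):
    """Pick up to k seeds by greedy maximum coverage over training cascades."""
    reached = influence_sets(train_cascades)
    seeds, covered = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for node in sorted(reached):  # sorted for deterministic tie-breaking
            if node in seeds:
                continue
            gain = len(reached[node] - covered)  # marginal gain in new users
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # no remaining node adds new users
            break
        seeds.append(best)
        covered |= reached[best]
    return seeds


def distinct_nodes_influenced(seeds, test_cascades):
    """Count distinct users appearing in test cascades initiated by the seeds."""
    reached = influence_sets(test_cascades)
    influenced = set()
    for seed in seeds:
        influenced |= reached.get(seed, set())
    return len(influenced)


# Toy data (assumed): "z" starts no cascade in training but a large one in test.
train = [("a", ["b", "c", "d"]), ("a", ["c", "e"]),
         ("b", ["c", "d"]), ("f", ["g", "h"])]
test = [("a", ["b", "x"]), ("f", ["g"]), ("z", ["a", "b", "c", "d", "e"])]

seeds = greedy_seeds(train, k=2)
print(seeds, distinct_nodes_influenced(seeds, test))
```

In this toy data, user `z` initiates no cascade during training, so a selection based only on training cascades cannot pick it, even though it reaches more users in the test period than the selected seeds combined. This illustrates, under the stated assumptions, why short-term influencers are hard to find from training data alone.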