Mining User-Generated Path Traversal Patterns in an Information Network

This paper studies patterns in user-generated click paths within the online encyclopedia Wikipedia. The click path data comprises over seven million goal-oriented clicks gathered from the Wiki Game, an online game in which the goal is to find a path between two given random Wikipedia articles. First, we propose using node-based path traversal patterns to derive a new measure of node centrality, arguing that a node is central if it proves useful in navigating through the network. A comparison with centrality measures from the literature shows that users generally "know" only a relatively small portion of the network, which they use frequently to reach their goal, and that this set of nodes differs significantly from the sets of central nodes identified by various existing centrality measures. Next, using the notion of subgraph centrality, we show that users are able to identify a small yet efficient portion of the graph that is useful for successfully completing their navigation goals.