Large-Scale Cross-Game Player Behavior Analysis on Steam

Behavioral game analytics has predominantly been confined to work on single games, which means that the cross-game applicability of current knowledge remains largely unknown. Here four experiments are presented focusing on the relationship between game ownership, time invested in playing games, and the players themselves, across more than 3000 games distributed by the Steam platform and over 6 million players, covering a total playtime of over 5 billion hours. The experiments target high-level patterns in player behavior with a focus on playtime, using frequent itemset mining on game ownership, cluster analysis to develop playtime-dependent player profiles, correlation between user game rankings and review scores, playtime and game ownership, as well as cluster analysis on Steam games. Within the context of playtime, the analyses presented provide unique insights into the behavior of game players as it occurs across games, for example in how players distribute their time across games.
Introduction and Contribution
Game companies today are able to collect behavioral telemetry data from entire populations of players, and with cloud-based storage technologies it is possible to collect and process every single user event from games. Furthermore, with the help of global game platforms such as Steam, Good Old Games, or console-based services, as well as social networking platforms like Facebook or Tango, increasingly large and broad audiences can be reached. However, despite a remarkable growth of interest, fueled by new business models (notably Free-to-Play, F2P) and mobile technologies, publicly available behavioral analytics in digital games has as yet been predominantly confined to single games.
Unlike other sectors such as e-commerce, there have been no large-scale cross-game behavioral studies, in part due to the recent, if highly accelerated, introduction of analytics practices in the game industry, but perhaps more importantly due to the confidentiality associated with behavioral telemetry data. This means that while dozens of telemetry-based studies and hundreds of observational studies of behavior in games have been published or presented, there is very little knowledge available in the public domain about how these translate across games (exceptions include Chambers and Saha 2005; Bauckhage et al. 2012). This makes it difficult to establish behavioral patterns that operate across some or all games, including for applied purposes such as informing game design, for example by improving retention and engagement (Bauckhage et al. 2012; Pittman and GauthierDickey 2010; Feng and Saha 2007). It also limits the ability to develop techniques used in e.g. e-commerce and behavioral economics for understanding and modeling user behavior (Resnick and Varian 1997; Ricci et al. 2011; Bogers 2009). The importance of cross-game behavioral analysis is underlined by the increasing number of platforms that offer games and by the fact that the same players tend to own multiple games. Understanding how games are played is not a trivial task, considering that multiple gameplay profiles can be observed from individual players.
In this paper we present four experiments performed on a 6 million player dataset, covering a total playtime of over 5 billion hours of play across more than 3000 games distributed via the Steam platform. Additional data was collected covering game rankings and review scores, as well as information on the genre, type and key game mechanics.
The results provide insights into the patterns around playtime in the games bought and played by Steam users, as well as patterns about the users themselves. Playtime is the focus of the experiments because this feature is an indication of player interest in or engagement with a game. In a highly competitive global marketplace for games, understanding the connections between the games played by a user, not just behavior within any one game, is vital for tasks such as cross-game promotions, migrating players between games (Sifa, Ojeda, and Bauckhage 2015) or game recommender systems (Sifa, Bauckhage, and Drachen 2014a). Summarizing the results:
1) Playtime distribution (players): Cluster analysis shows that the majority of players are more or less dedicated to one or a few games. Only about a third of the players put similar amounts of time into a variety of games (given a k = 11 solution). Playtime distribution is highly skewed. The number of owned games is distributed in a similar way. The average number of owned games is 22.1 (standard deviation = 35.5).
2) Playtime distribution (games): Cluster results run on aggregate playtime
Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Proceedings, The Eleventh AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-15)
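
The idea of a player being "dedicated to one or a few games" can be illustrated by normalizing each player's playtime into per-game shares. The following is a minimal sketch only, not the cluster analysis used in the experiments: the toy data, the function names, and the 0.8 cutoff are all hypothetical and were chosen for illustration, and a fixed threshold on the largest share stands in for the playtime-dependent profiles that clustering would produce.

```python
from statistics import mean, pstdev

# Hypothetical toy data (not from the Steam dataset):
# player id -> {game title: hours played}.
playtime = {
    "p1": {"Game A": 950.0, "Game B": 30.0, "Game C": 20.0},
    "p2": {"Game A": 40.0, "Game B": 35.0, "Game C": 45.0, "Game D": 30.0},
    "p3": {"Game B": 10.0},
}

# Assumed cutoff for illustration only; the paper derives profiles
# by cluster analysis, not by a fixed threshold.
DEDICATED_THRESHOLD = 0.8


def playtime_shares(hours_by_game):
    """Return each game's fraction of the player's total playtime."""
    total = sum(hours_by_game.values())
    if total == 0:
        return {game: 0.0 for game in hours_by_game}
    return {game: hours / total for game, hours in hours_by_game.items()}


def top_share(hours_by_game):
    """Return the largest single-game share of a player's playtime."""
    return max(playtime_shares(hours_by_game).values(), default=0.0)


# Label each player by how concentrated their playtime is.
profiles = {
    player: "dedicated" if top_share(games) >= DEDICATED_THRESHOLD else "spread"
    for player, games in playtime.items()
}

# Summary statistics on the number of owned games per player.
owned_counts = [len(games) for games in playtime.values()]
owned_mean = mean(owned_counts)
owned_sd = pstdev(owned_counts)

if __name__ == "__main__":
    print(profiles)
    print(f"owned games: mean={owned_mean:.2f}, sd={owned_sd:.2f}")
```

On real data, the per-player share vectors computed by `playtime_shares` would be the natural input to a clustering method, while the mean and standard deviation of `owned_counts` correspond to the ownership statistics reported in the summary.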

[1] Boi Faltings, et al. Hidden Markov models for churn prediction, 2015, 2015 SAI Intelligent Systems Conference (IntelliSys).

[2] Christian Bauckhage, et al. Clustering Game Behavior Data, 2015, IEEE Transactions on Computational Intelligence and AI in Games.

[3] C. Ji An Archetypal Analysis on, 2005.

[4] Sophie Ahrens, et al. Recommender Systems, 2012.

[5] Christian Bauckhage, et al. How players lose interest in playing a game: An empirical study based on distributions of total playing times, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[6] Peng Gao, et al. Churn prediction for high-value players in casual social games, 2014, 2014 IEEE Conference on Computational Intelligence and Games.

[7] Bracha Shapira, et al. Recommender Systems Handbook, 2015, Springer US.

[8] Chris GauthierDickey, et al. Characterizing Virtual Populations in Massively Multiplayer Online Role-Playing Games, 2010, MMM.

[9] Jian Pei, et al. Mining frequent patterns without candidate generation, 2000, SIGMOD '00.

[10] Christian Bauckhage, et al. Archetypal Game Recommender Systems, 2014, LWA.

[11] Christian Bauckhage, et al. Predicting player churn in the wild, 2014, 2014 IEEE Conference on Computational Intelligence and Games.

[12] Tim Fields, et al. Social Game Design: Monetization Methods and Mechanics, 2011.

[13] Debanjan Saha, et al. A long-term study of a popular MMORPG, 2007, NetGames '07.

[14] Christian Bauckhage, et al. Guns, swords and data: Clustering of player behavior in computer games in the wild, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[15] A.P.J. van den Bosch, et al. Recommender Systems for Social Bookmarking, 2005.

[16] Rakesh Agrawal, et al. Fast Algorithms for Mining Association Rules, 1994, VLDB 1994.

[17] Christian Bauckhage, et al. The Playtime Principle: Large-scale cross-games interest modeling, 2014, 2014 IEEE Conference on Computational Intelligence and Games.

[18] Christian Borgelt, et al. Frequent item set mining, 2012, WIREs Data Mining Knowl. Discov.

[19] Christian Bauckhage, et al. User Churn Migration Analysis with DEDICOM, 2015, RecSys.

[20] Christian Bauckhage, et al. Behavior evolution in Tomb Raider Underworld, 2013, 2013 IEEE Conference on Computational Intelligence in Games (CIG).

[21] Simon Colton, et al. Mining Rules from Player Experience and Activity Data, 2012, AIIDE.

[22] Debanjan Saha, et al. Measurement-based characterization of a collection of on-line games, 2005, IMC '05.

[23] Christian Bauckhage, et al. A comparison of methods for player clustering via behavioral telemetry, 2013, FDG.