An Empirical Evaluation of Evaluation Metrics of Procedurally Generated Mario Levels

Several approaches in the literature automatically generate Infinite Mario Bros levels. These approaches are often evaluated solely with computational metrics such as leniency and linearity. While such metrics are useful for an initial exploratory evaluation of the generated content, it is not clear whether they capture the player’s perception of that content. In this paper we evaluate several commonly used computational metrics. Specifically, we perform a systematic user study with procedural content generation systems and compare the insights gained from the user study with those gained from analyzing the computational metric values. The results of our experiment suggest that current computational metrics should not be used in lieu of user studies for evaluating content generated by computer programs.
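To make the two named metrics concrete, the following is a minimal sketch of how linearity and leniency are often computed. It is an illustration, not the formulation used in this paper. The level encoding (one ground height per column, plus counts of gaps, enemies, and power-ups), the function names, and the leniency weights are all assumptions chosen for the example.

```python
from statistics import mean


def linearity(heights):
    """Mean absolute deviation of per-column ground heights from a
    least-squares line. Lower values mean a more linear level.
    `heights` is a hypothetical encoding: one ground height per column."""
    xs = list(range(len(heights)))
    x_bar, y_bar = mean(xs), mean(heights)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # A single-column level has no slope to fit; treat it as flat.
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, heights)) / sxx
        if sxx
        else 0.0
    )
    intercept = y_bar - slope * x_bar
    return mean(abs(y - (slope * x + intercept)) for x, y in zip(xs, heights))


def leniency(gaps, enemies, powerups, width):
    """Weighted count of level elements, normalized by level width.
    The weights (-1 per gap, -1 per enemy, +1 per power-up) are
    illustrative assumptions; published variants use different weights."""
    return (-1 * gaps - 1 * enemies + 1 * powerups) / width


if __name__ == "__main__":
    print(linearity([2, 2, 3, 3, 5, 2, 2]))
    print(leniency(gaps=2, enemies=3, powerups=1, width=10))
```

Both functions reduce a level to a single number computed from its layout alone, which is why they are cheap to apply to many generated levels and also why they may miss aspects of how a player experiences the level.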
