An Automated Approach to Assessing an Application Tutorial’s Difficulty

Online step-by-step text and video tutorials play an integral role in learning feature-rich software applications. However, when searching, users can find it difficult to assess whether a tutorial is designed for their level of software expertise. Novice users can struggle when a tutorial is out of their reach, whereas more advanced users can end up wasting time with overly simple, first-principles instruction. To assist users in selecting tutorials, we investigate the feasibility of using machine-learning techniques to automatically assess a tutorial’s difficulty. Using Photoshop as our primary testbed, we develop a set of distinguishable tutorial features, and use these features to train a classifier that can label a tutorial as either Beginner or Advanced with 85% accuracy. To illustrate a potential application, we develop a tutorial browsing interface called TutVis. Our initial user evaluation provides insight into TutVis’s ability to support users in a range of tutorial selection scenarios.
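The feature-then-classify pipeline described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three features (word count, numbered-step count, average word length), the function names, and the nearest-centroid classifier are hypothetical stand-ins, not the feature set or model used in this work, and the sketch makes no attempt to reproduce the reported accuracy.

```python
import math


def extract_features(text):
    """Map a tutorial's text to a small numeric feature vector.

    Hypothetical features, chosen only for illustration:
    word count, number of lines that start with a digit
    (a rough proxy for numbered steps), and average word length.
    """
    words = text.lower().split()
    n_words = len(words)
    n_steps = sum(1 for line in text.splitlines() if line.strip()[:1].isdigit())
    avg_word_len = sum(len(w) for w in words) / max(n_words, 1)
    return [n_words, n_steps, avg_word_len]


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(examples):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(rows) for label, rows in by_label.items()}


def classify(model, text):
    """Return the label whose centroid is closest in Euclidean distance."""
    feats = extract_features(text)
    return min(model, key=lambda label: math.dist(feats, model[label]))


# Toy training data (invented for this sketch, not from any real corpus).
model = train([
    ("1. Open the file\n2. Click the crop tool\n3. Save the image", "Beginner"),
    ("Create a luminosity mask from channel calculations, then apply "
     "frequency separation with a high pass filter on a duplicated layer "
     "and blend it in linear light mode before refining the mask edges by hand",
     "Advanced"),
])
print(classify(model, "1. Open the photo\n2. Click save"))
```

The features here are left unscaled, so word count dominates the distance; a realistic setup would normalize the features and evaluate on held-out tutorials.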
