Text Retrieval-Based Tagging of Software Engineering Video Tutorials

Video tutorials are an emerging form of documentation in software engineering and can efficiently provide developers with the information they need for their daily tasks. However, developers must first find the right tutorial for the task at hand. Currently, little information is available to quickly judge whether a tutorial is relevant to a topic or helpful for a given task, so developers can miss the best tutorials and waste time watching irrelevant ones. We present first efforts towards new tagging approaches that use text retrieval to describe the contents of software engineering video tutorials, making it easier and faster to understand their purpose and contents. We also present the results of a preliminary evaluation of thirteen such approaches, revealing the potential of some and the limitations of others.
