Why, when, and what: Analyzing Stack Overflow questions by topic, type, and code

Questions from Stack Overflow provide a unique opportunity to gain insight into what programming concepts are the most confusing. We present a topic modeling analysis that combines question concepts, types, and code. Using topic modeling, we are able to associate programming concepts and identifiers (like the String class) with particular types of questions, such as, “how to perform encoding”.

[1]  Martin Porter,et al.  Snowball: A language for stemming algorithms , 2001 .

[2]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[3]  Charles A. Sutton,et al.  Mining source code repositories at massive scale using language modeling , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).