Identifying Textual Features of High-Quality Questions: An Empirical Study on Stack Overflow

Background: Stack Overflow (SO) is a programming-specific Q&A website that serves as a valuable repository of software engineering knowledge. For SO members, formulating a good question is the first step towards eliciting satisfactory responses. Aims: To guide SO members on how to formulate a good question, we conduct an empirical study using the publicly available Stack Overflow Data Dump covering 2008-2016. Method: We first choose 25 features along 5 dimensions to represent the textual characteristics of interest. Using the Boruta algorithm, we then identify all features that are either strongly or weakly relevant to question quality. Results: The numbers of tags and code snippets are the most discriminative features, whereas question quality correlates only weakly with sentiment-related factors. Based on this empirical evidence, we provide practical suggestions to SO members on how to improve their questions. Conclusions: We believe our findings give SO members a better understanding of the patterns behind high-quality questions, with the ultimate goal of supporting effective and efficient use of Q&A websites.
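
The core idea behind Boruta-style feature selection can be illustrated with a small sketch. Boruta proper compares random-forest importances of the real features against those of permuted "shadow" copies and applies a statistical test over many iterations. The version below is a deliberate simplification, not the study's pipeline: it swaps in absolute Pearson correlation as the importance score and only counts how often each feature beats the best shadow. The feature names (`n_code_snippets`, `noise`), the toy data, and the function names are hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def shadow_feature_hits(columns, labels, n_rounds=20, seed=0):
    """Count, per feature, the rounds in which it beats the best shadow.

    A shadow is a shuffled copy of a real column, so any association it
    shows with the labels is due to chance alone.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        best_shadow = 0.0
        for values in columns.values():
            shadow = list(values)
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_corr(shadow, labels))
        for name, values in columns.items():
            if abs_corr(values, labels) > best_shadow:
                hits[name] += 1
    return hits


# Toy data (assumed): 200 questions, label 1 = high quality.
labels = [i % 2 for i in range(200)]
columns = {
    # Tracks the label perfectly, so it should beat every shadow.
    "n_code_snippets": [1 + 2 * y for y in labels],
    # Balanced against the label by construction, so it carries no signal.
    "noise": [(i // 2) % 2 for i in range(200)],
}
hits = shadow_feature_hits(columns, labels)
print(hits)
```

On this toy data the informative column wins in every round and the uninformative one never does. A full Boruta run would turn such hit counts into confirmed, tentative, or rejected decisions via a statistical test rather than reading them off directly.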
