Does the hiding mechanism for Stack Overflow comments work well? No!

Stack Overflow has accumulated millions of answers. Informative comments can strengthen their associated answers (e.g., by providing additional information). Currently, Stack Overflow hides comments that are ranked beyond the top 5. With this mechanism, Stack Overflow aims to display more informative comments (i.e., the ones with higher scores) and hide less informative ones. As a result, 4.4 million comments are hidden under their answer threads. It is therefore important to understand how well the current comment hiding mechanism works. In this study, we investigate whether the mechanism effectively delivers informative comments while hiding uninformative ones. We find that: 1) Hidden comments are as informative as displayed comments; more than half of the comments (both hidden and displayed) are informative (e.g., providing alternative answers, or pointing out flaws in their associated answers). 2) In most cases, the current comment hiding mechanism ranks and hides comments based on their creation time instead of their score, due to the large number of tie-scored comments (e.g., 87% of the comments have a score of 0). 3) In 97.3% of answers that have hidden comments, at least one comment is hidden while another comment with the same score is displayed (we refer to such cases as unfairly hidden comments). Among such cases, the longest unfairly hidden comment is more likely to be informative than the shortest unfairly displayed comment. Our findings suggest that Stack Overflow should consider adjusting its current comment hiding mechanism, e.g., by displaying longer unfairly hidden comments in place of shorter unfairly displayed comments. We also recommend that users examine all comments, so that they do not miss informative details in hidden comments, such as software obsolescence, code error reports, or notices of security vulnerabilities.
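
The hiding rule and the notion of an unfairly hidden comment can be illustrated with a minimal sketch. It is not Stack Overflow's actual implementation: the `Comment` type, the function names, and the assumption that ties are broken in favor of the earlier comment are all hypothetical, chosen only to mirror the behavior described above (top 5 by score, with creation time deciding among tie-scored comments).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    text: str
    score: int
    created: int  # hypothetical creation timestamp; smaller means earlier


def split_displayed_hidden(comments, limit=5):
    """Rank by score (descending), break ties by creation time (assumed:
    earlier first), then display the top `limit` and hide the rest."""
    ranked = sorted(comments, key=lambda c: (-c.score, c.created))
    return ranked[:limit], ranked[limit:]


def unfairly_hidden(comments, limit=5):
    """Hidden comments whose score equals that of some displayed comment."""
    displayed, hidden = split_displayed_hidden(comments, limit)
    displayed_scores = {c.score for c in displayed}
    return [c for c in hidden if c.score in displayed_scores]


# Seven 0-score comments plus one late comment with a score of 2.
comments = [Comment(f"c{i}", 0, i) for i in range(7)]
comments.append(Comment("late but upvoted", 2, 99))

displayed, hidden = split_displayed_hidden(comments)
unfair = unfairly_hidden(comments)
```

In this toy thread, the upvoted comment is displayed regardless of its age, but the remaining four slots go to the earliest 0-score comments, so the three later 0-score comments are hidden purely because of when they were posted.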
