Toward Better Summarizing Bug Reports With Crowdsourcing Elicited Attributes

Recent years have witnessed a growing demand for resolving the large numbers of bug reports that arise in software maintenance. To reduce the time testers and developers spend reading bug reports, bug report summarization has attracted considerable research effort. However, no systematic analysis has been conducted on attribute construction, which heavily affects the performance of supervised algorithms for bug report summarization. In this study, we first survey the existing methods for attribute construction in mining software repositories. We then propose a new method, Crowd-Attribute, which infers new effective attributes from crowd-generated data in crowdsourcing, and we develop a new tool, the Crowdsourcing Software Engineering Platform, to facilitate this method. With Crowd-Attribute, we construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). To evaluate the effectiveness of LRCA, we build a series of large-scale datasets with 105,177 bug reports. Experiments on both the public dataset SDS, which contains 36 manually annotated bug reports, and the new large-scale datasets demonstrate that LRCA consistently outperforms the state-of-the-art algorithms for bug report summarization.
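
As a rough illustration of the supervised setting described above, the following minimal sketch trains a logistic regression classifier over per-sentence attribute vectors and then selects the highest-scoring sentences as an extractive summary. It is not the LRCA implementation: the two binary attributes, the toy training data, the gradient-descent training loop, and the word-budget selection rule are all assumptions made for illustration, and none of the 11 crowdsourced attributes from the study are reproduced here.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def summarize(sentences, features, w, b, word_budget):
    """Pick the highest-scoring sentences within a word budget, in original order."""
    scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, f)) + b) for f in features]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        length = len(sentences[i].split())
        if used + length <= word_budget:
            chosen.append(i)
            used += length
    return [sentences[i] for i in sorted(chosen)]


# Hypothetical attributes per sentence: [describes_a_symptom, is_a_courtesy_phrase].
# Labels mark whether an annotator put the sentence in the reference summary.
X_train = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_train = [1, 1, 0, 0]
w, b = train_logistic_regression(X_train, y_train)

sentences = [
    "App crashes on startup after update.",
    "Thanks for looking into this!",
    "Stack trace points to a null config path.",
]
features = [[1, 0], [0, 1], [1, 1]]
summary = summarize(sentences, features, w, b, word_budget=14)
print(summary)
```

In this sketch the quality of the summary depends entirely on how informative the attribute vectors are, which is the step the study targets with attribute construction.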
