Using Knowledge Transfer and Rough Set to Predict the Severity of Android Test Reports via Text Mining

Crowdsourcing is an appealing and economical solution for software application testing because it can reach a large international audience. However, crowdsourced testing can also produce a flood of bug reports, so inspecting the resulting test reports is an enormous yet essential software maintenance task. Automatically predicting the severity of crowdsourced test reports is therefore important, given their high volume and large proportion of noise. Most existing approaches to this problem rely on supervised machine learning techniques, which typically require users to manually label a large amount of training data. However, Android test reports are not labeled with severity levels, and manual labeling is time-consuming and labor-intensive. To address these problems, we propose a Knowledge Transfer Classification (KTC) approach based on text mining and machine learning to predict the severity of test reports. Our approach obtains training data from bug repositories and uses knowledge transfer to predict the severity of Android test reports. In addition, it applies an Importance Degree Reduction (IDR) strategy based on rough set theory to extract characteristic keywords and obtain more accurate reduction results. The results of several experiments indicate that our approach is effective in predicting the severity of Android test reports.
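The abstract does not spell out the KTC pipeline, but its core setup, training on labeled bug-repository reports and transferring the resulting model to unlabeled Android test reports, can be illustrated with a minimal sketch. Everything below is assumed for illustration only: the toy reports, the plain TF-IDF plus Naive Bayes pipeline, and all names are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal cross-domain severity-prediction sketch (assumed baseline, not
# the paper's KTC method). Source domain: labeled bug-repository reports;
# target domain: unlabeled Android test reports.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

source_texts = ["app crashes on launch", "minor typo in settings menu"]  # toy data
source_labels = ["severe", "non-severe"]
target_texts = ["camera app crashes after rotation"]  # unlabeled target reports

# Fit one vocabulary over both domains so the source-trained classifier
# scores target reports in the same feature space.
vectorizer = TfidfVectorizer(stop_words="english")
vectorizer.fit(source_texts + target_texts)

clf = MultinomialNB()
clf.fit(vectorizer.transform(source_texts), source_labels)

# Knowledge transfer in its simplest form: apply the source-trained model
# to the target-domain reports.
print(clf.predict(vectorizer.transform(target_texts)))  # e.g. ['severe']
```

Likewise, IDR is not defined in the abstract. Importance-based reduction in rough set theory is conventionally built on the dependency degree gamma, where the significance of an attribute a in condition set C with respect to decision D is gamma_C(D) - gamma_{C\{a}}(D). Assuming IDR ranks keyword attributes this way (an assumption, not the paper's stated definition), the computation over a toy keyword-occurrence decision table looks like:

```python
def dependency_degree(rows, attrs, decision):
    """gamma_attrs(decision): fraction of rows in the positive region,
    i.e. in attrs-equivalence classes with a single decision value."""
    groups = {}
    for row in rows:
        groups.setdefault(tuple(row[a] for a in attrs), []).append(row[decision])
    positive = sum(len(d) for d in groups.values() if len(set(d)) == 1)
    return positive / len(rows)

def importance_degree(rows, attrs, decision, a):
    """Drop in dependency degree when keyword attribute a is removed."""
    reduced = [x for x in attrs if x != a]
    return (dependency_degree(rows, attrs, decision)
            - dependency_degree(rows, reduced, decision))

# Toy decision table: 1/0 = keyword present/absent in a report.
table = [
    {"crash": 1, "typo": 0, "severity": "severe"},
    {"crash": 1, "typo": 1, "severity": "severe"},
    {"crash": 0, "typo": 1, "severity": "non-severe"},
    {"crash": 0, "typo": 0, "severity": "non-severe"},
]
for kw in ("crash", "typo"):
    print(kw, importance_degree(table, ["crash", "typo"], "severity", kw))
# crash 1.0  -> indispensable characteristic keyword
# typo  0.0  -> dispensable; a reduct drops it
```

On this toy table "crash" alone determines severity, so its importance degree is 1.0 while "typo" scores 0.0 and would be eliminated by the reduction, which matches the intuition behind extracting characteristic keywords.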
