Characterizing Common and Domain-Specific Package Bugs: A Case Study on Ubuntu

Ubuntu is an open source software platform that runs on everything from smartphones, tablets, and PCs to servers and the cloud. Ubuntu ships many self-contained and third-party software packages for different uses, and a single bug report can affect one or more packages simultaneously. Identifying the package bugs that are common in Ubuntu can help both developers and users better understand the packages they develop or use, and can provide guidelines for developers of similar packages in the future. In this paper, we perform a large-scale empirical study of common package bugs in Ubuntu by leveraging topic modeling. By analyzing a total of 240,097 bug reports, we identify three general bug types that are common to all Ubuntu packages: Graphical User Interface (GUI), Maintenance, and Runtime bugs. Moreover, we group the 100 packages with the most bug reports into six categories (graphics, internet, office, sound and video, system management, and kernel) and identify domain-specific bugs for each category.
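To make the idea of topic modeling over bug reports concrete, the following is a minimal sketch, not the study's actual pipeline. The abstract does not say which topic model or preprocessing is used; the sketch assumes Latent Dirichlet Allocation fitted by collapsed Gibbs sampling as one common choice. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy reports are all hypothetical and chosen only for illustration.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch only).

    docs is a list of token lists. Returns (top_words, doc_topic), where
    top_words[k] lists up to five frequent words of topic k and
    doc_topic[d][k] counts the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    top_words = []
    for k in range(num_topics):
        ranked = sorted(range(vocab_size), key=lambda j: topic_word[k][j], reverse=True)
        top_words.append([vocab[j] for j in ranked[:5]])
    return top_words, doc_topic


# Hypothetical, already-tokenized bug report summaries (not real data).
toy_reports = [
    ["window", "button", "icon", "display", "window"],
    ["menu", "icon", "button", "display"],
    ["crash", "segfault", "startup", "crash"],
    ["segfault", "startup", "memory", "crash"],
]

topics, doc_topic = fit_lda(toy_reports, num_topics=2)
for k, words in enumerate(topics):
    print(f"topic {k}: {', '.join(words)}")
```

In a study of this kind, each discovered topic would then be inspected and labeled by hand (for example as a GUI or runtime issue); the labeling step is a manual judgment that the sketch does not attempt.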
