An NLP-based Tool for Software Artifacts Analysis

Software developers rely on various repositories and communication channels to exchange relevant information about their ongoing tasks and overall project progress. In this context, researchers have leveraged semi-structured and unstructured software artifacts to build recommender systems that support developers in different tasks, such as transforming user feedback into maintenance and evolution tasks, suggesting experts, or generating software documentation. More specifically, Natural Language (NL) parsing techniques have been successfully used to automatically identify (or extract) the relevant information embedded in unstructured software artifacts. However, such techniques require the manual identification of patterns to be used for classification purposes. To reduce this manual effort, we propose NEON, an NL parsing-based tool for software artifact analysis that automates the mining of such rules, minimizing the manual effort required of developers and researchers. Through a small study involving human subjects with NL processing and parsing expertise, we assess the performance of NEON in identifying rules useful for classifying app reviews for software maintenance purposes. Our results show that more than one-third of the rules inferred by NEON are relevant for the proposed task. Demo webpage: https://github.com/adisorbo/NEON_tool
