Finding the merits and drawbacks of software resources from comments

To reuse software resources efficiently, developers need adequate assurance of their quality. However, our investigation showed that most software resources on the Internet do not provide sufficient quality descriptions. In this paper, we propose an approach that helps developers judge a software resource's quality based on comments. In our approach, comments about software resources are automatically collected from the Internet, the sentiment polarity (positive or negative) of each comment is identified, and the quality aspects that the comment discusses are extracted. As a result, the merits and drawbacks of software resources are drawn out, which can help developers judge a software resource's quality during software resource selection and reuse. To evaluate our approach, we applied it to a group of open source software, and the results showed that it achieved satisfactory precision in finding merits and drawbacks.
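The pipeline described above (identify each comment's polarity, extract the quality aspects it mentions, then group the results into merits and drawbacks) can be illustrated with a minimal sketch. This is not the method evaluated in the paper: the word lists, the aspect names, and the functions `tokenize`, `polarity`, `aspects`, and `summarize` are all hypothetical, and a simple lexicon lookup stands in for whatever classifier and aspect extractor the approach actually uses. Comment collection is assumed to have already happened.

```python
from collections import defaultdict

# Hypothetical lexicons, invented for this illustration only.
POSITIVE = {"fast", "stable", "easy", "great", "reliable"}
NEGATIVE = {"slow", "buggy", "crash", "crashes", "confusing", "hard"}

# Hypothetical mapping from quality aspects to indicative keywords.
ASPECTS = {
    "performance": {"fast", "slow", "speed", "memory"},
    "reliability": {"stable", "buggy", "crash", "crashes", "reliable"},
    "usability": {"easy", "confusing", "hard", "documentation", "install"},
}


def tokenize(comment):
    """Lowercase the comment and strip surrounding punctuation from each word."""
    return [w.strip(".,!?;:()\"'").lower() for w in comment.split()]


def polarity(tokens):
    """Label a comment by counting positive versus negative lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def aspects(tokens):
    """Return the quality aspects whose keywords appear in the comment."""
    present = set(tokens)
    return sorted(name for name, keywords in ASPECTS.items() if keywords & present)


def summarize(comments):
    """Group comments into merits and drawbacks, keyed by quality aspect."""
    summary = {"merits": defaultdict(list), "drawbacks": defaultdict(list)}
    for comment in comments:
        tokens = tokenize(comment)
        label = polarity(tokens)
        if label == "neutral":
            continue  # no sentiment signal, so the comment supports neither side
        bucket = "merits" if label == "positive" else "drawbacks"
        for aspect in aspects(tokens):
            summary[bucket][aspect].append(comment)
    return {name: dict(groups) for name, groups in summary.items()}


if __name__ == "__main__":
    sample = [
        "The parser is fast and stable.",
        "Installation is confusing and it crashes often.",
        "I used it last year.",
    ]
    for bucket, groups in summarize(sample).items():
        print(bucket, groups)
```

In this sketch a comment with mixed sentiment is assigned a single label for all the aspects it mentions, which mirrors the comment-level polarity described above but would misattribute sentences such as "fast but buggy"; a finer-grained variant would score each sentence or clause separately.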
