Success prediction of Android applications in a novel repository using neural networks

Android applications now play a major role in the software industry, so a system that helps companies predict the success probability of such applications would be useful. Numerous studies have predicted the success probability of desktop applications using a variety of machine learning techniques. However, because the features of desktop programs differ from those of mobile applications, those approaches are not applicable to mobile applications. To our knowledge, there has so far been no repository, or even a method, for predicting the success probability of Android applications. In this research, we introduce a repository composed of 100 successful and 100 unsuccessful Android apps from the Google Play Store™, with 34 features per application. We then apply a neural network and other classification algorithms to the repository to predict the success probability. Finally, we compare the proposed method with previous approaches based on the accuracy criterion. Experimental results show that the best accuracy we achieved is 99.99%, obtained using MLP with PCA, whereas the best accuracy achieved by previous work on desktop platforms was 96%. However, the time complexity of the proposed approach is higher than that of previous methods, since the time complexities of NPR and MLP are $$O(n^3)$$ and $$O(nph^koi)$$, respectively.
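To give a feel for the two complexity terms quoted above, the following minimal sketch evaluates them for the repository size stated in the abstract (n = 200 apps, p = 34 features). The abstract does not define the remaining symbols, so the reading used here is an assumption: h hidden units per layer, k hidden layers, o output units, and i training iterations. The values chosen for h, k, o, and i are hypothetical and are not taken from the paper. The results are unitless growth terms, not measured runtimes.

```python
def mlp_cost_term(n, p, h, k, o, i):
    """Evaluate the growth term n * p * h**k * o * i.

    Assumed reading of the symbols (not defined in the abstract):
    n samples, p input features, h units per hidden layer,
    k hidden layers, o output units, i training iterations.
    """
    return n * p * h**k * o * i


def cubic_cost_term(n):
    """Evaluate the growth term n**3 quoted for NPR."""
    return n**3


# Values stated in the abstract.
n_apps = 200      # 100 successful + 100 unsuccessful apps
n_features = 34   # features per app, before any PCA reduction

# Hypothetical network settings, chosen only for illustration.
hidden_units = 10
hidden_layers = 1
outputs = 1
iterations = 100

mlp_term = mlp_cost_term(n_apps, n_features, hidden_units,
                         hidden_layers, outputs, iterations)
cubic_term = cubic_cost_term(n_apps)

print(f"MLP term:   {mlp_term:,}")
print(f"Cubic term: {cubic_term:,}")
```

Under these assumed settings both terms are in the millions. Because h is raised to the power k, adding hidden layers increases the MLP term much faster than adding samples or features does.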
