When Will You Have a New Mobile Phone? An Empirical Answer From Big Data

When and why people change their mobile phones are important questions in the mobile communications industry, because the answers greatly affect marketing strategy and revenue estimation for both mobile operators and manufacturers. Using big data to analyze and predict the phone changing event is a promising approach. In this paper, based on mobile user big data, we first find through statistical analysis that three probability distributions, i.e., the power-law, log-normal, and geometric distributions, play an important role in user behavior. Second, we establish the relationships between eight selected attributes and phone changing; for example, young people are more inclined to change their phones if they are using low-occupancy phones or feature phones. Third, we evaluate the performance of four prediction models on the phone changing event under three scenarios. Information gain ratio is used for attribute selection, and then sampling methods and cost-sensitive learning, together with standard classifiers, are used to handle the imbalance of the phone changing event. Experimental results show that our proposed enhanced backpropagation neural network attains better prediction performance in the undersampling scenario.
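
To make the two preprocessing steps named above concrete, the following is a minimal Python sketch of information gain ratio for a categorical attribute and of random undersampling of the majority class. It is an illustration under stated assumptions, not the authors' implementation: the function names (`entropy`, `gain_ratio`, `undersample`) and the toy data are hypothetical, and the paper's actual attributes and sampling ratios are not given in this abstract.

```python
import math
import random
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of `values` about `labels`, divided by the split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)  # entropy of the attribute itself
    return info_gain / split_info if split_info > 0 else 0.0


def undersample(rows, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for r, y in zip(rows, labels):
        by_class.setdefault(y, []).append(r)
    n = min(len(v) for v in by_class.values())
    balanced = [(r, y) for y, items in by_class.items() for r in rng.sample(items, n)]
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): label 1 means the user changed phones.
phone_type = ["feature", "feature", "smart", "smart"]
changed = [1, 1, 0, 0]
ratio = gain_ratio(phone_type, changed)

rows = list(range(10))
imbalanced_labels = [0] * 8 + [1] * 2
balanced = undersample(rows, imbalanced_labels)
```

In this toy case the attribute separates the two classes perfectly, so its gain ratio is the maximum possible, while an attribute unrelated to the label would score zero. Undersampling trades away majority-class rows for a balanced training set, which is one of the scenarios the abstract compares.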
