Real World Applications of Machine Learning Techniques over Large Mobile Subscriber Datasets

Abstract

Communication Service Providers (CSPs) are in a unique position to utilize their vast transactional data assets generated from interactions of subscribers with network elements as well as with other subscribers. CSPs could leverage their data assets for a gamut of applications such as service personalization, predictive offer management, loyalty management, revenue forecasting, network capacity planning, product bundle optimization and churn management to gain significant competitive advantage. However, due to the sheer volume, variety, velocity and veracity of mobile subscriber datasets, sophisticated data analytics techniques and frameworks are necessary to derive actionable insights in a usable timeframe. In this paper, we describe our journey from a relational database management system (RDBMS) based campaign management solution, which allowed data scientists and marketers to use hand-written rules for service personalization and targeted promotions, to a distributed Big Data Analytics platform capable of performing large-scale machine learning and data mining to deliver real-time service personalization, predictive modelling and product optimization. Our work involves a careful blend of technology, processes and best practices, which facilitate man-machine collaboration and continuous experimentation to derive measurable economic value from data. Our platform has a reach of more than 500 million mobile subscribers worldwide, delivering over 1 billion personalized recommendations annually and processing a total data volume of 64 petabytes, corresponding to 8.5 trillion events.
