Analytics: Key to Go from Generating Big Data to Deriving Business Value

The potential to extract actionable insights from Big Data has attracted increasing attention from researchers in academia and from several industrial sectors. Interest in the field has grown as organizations try to tame large volumes of complex, fast-arriving Big Data streams through newer computing paradigms. However, extracting meaningful and actionable information from Big Data remains a challenging task. Generating value from large volumes of data is an art that, combined with analytical skills, must be mastered in order to gain competitive advantage in business. The ability of organizations to leverage emerging technologies and integrate Big Data into their enterprise architectures effectively depends on the maturity of their technology and business teams, the capabilities they develop, and the strategies they adopt. In this paper, through selected use cases, we demonstrate how statistical analyses, machine learning, optimization, and text mining algorithms can be applied to extract meaningful insights from data available through social media, online commerce, the telecommunications industry, and smart utility meters, and how those insights can be used for a variety of business benefits, including improved security. The choice of analytical technique depends largely on the nature of the underlying problem, so a one-size-fits-all solution hardly exists. Deriving information from Big Data is also subject to challenges associated with data security and privacy. These and other challenges are discussed in the context of the selected problems to illustrate the potential of Big Data analytics.
