Big Data Mining: An Overview

Big data is an evolving term for any voluminous body of structured, semi-structured, and unstructured data that has the potential to be mined for information. Although the term does not refer to a specific quantity, it is often used when speaking of petabytes and exabytes of data. Big Data mining is the capability of extracting useful information from such large datasets or data streams, a task that was previously infeasible because of their volume, variability, and velocity. An example of big data might be petabytes (1 petabyte = 1,024 terabytes) or exabytes (1 exabyte = 1,024 petabytes) of data comprising billions to trillions of records about millions of people, all drawn from different sources (e.g., the Web, sales, customer contact centres, social media, and mobile data). Such data is typically loosely structured and often incomplete and inaccessible. Data mining is used today primarily by companies with a strong consumer focus: retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics. It also enables them to determine the impact of these factors on sales, customer satisfaction, and corporate profits, and to "drill down" into summary information to view detailed transactional data. Here we present a HACE theorem that characterizes the features of the Big Data revolution.
