Analysis of dengue outbreaks using big data analytics and social networks

The term Big Data can be defined as the analysis of large volumes of data held in unstructured databases. Organizations of different sectors and sizes have been adopting Big Data Analytics as a strategic support tool to anticipate valuable insights and trends in consumer behavior and expectations, thereby gaining a competitive advantage in the markets in which they operate. However, extracting valuable information, that is, turning that volume of data into predictive information or insights, remains a major challenge in Big Data. The main objective of this work is to present an implementation of a Big Data project that uses data from social networks together with text mining and machine learning techniques, specifically the K-Means and SVM algorithms, to identify patterns in dengue outbreaks through analyses that point to probable outbreaks in a particular region of Brazil. The results indicate that the implemented project performed satisfactorily when compared with data collected from the Ministry of Health of Brazil, which suggests that it has potential for its intended purpose. The main advantage of Big Data analysis is the possibility of combining unstructured data, obtained from social networks, e-commerce sites, and other sources, with structured data from traditional databases, and of extracting from this union valuable information that organizations can use to anticipate future behavior and act preventively.
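As a rough illustration of the clustering step mentioned above, the following sketch groups a handful of short posts with K-Means over term-frequency vectors. It is not the implementation used in this work: the posts in `POSTS` are invented for illustration, the helper names (`tokenize`, `tf_vectors`, `kmeans`) are hypothetical, the initial centroids are fixed by hand to keep the run deterministic, and the SVM classification stage is left out entirely.

```python
import math
import re
from collections import Counter

# Hypothetical example posts; real input would come from a social network.
POSTS = [
    "high fever and joint pain, I think I have dengue",
    "dengue fever again, joint pain all week",
    "my neighbour has dengue, fever and pain",
    "great football match tonight at the stadium",
    "football tonight, the stadium was full",
    "watching the football match at the stadium",
]

def tokenize(text):
    """Lowercase a post and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_vectors(posts):
    """Turn each post into an L2-normalised term-frequency vector."""
    vocab = sorted({tok for post in posts for tok in tokenize(post)})
    vectors = []
    for post in posts:
        counts = Counter(tokenize(post))
        vec = [float(counts[term]) for term in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, init_indices, max_iter=50):
    """Lloyd's algorithm; centroids start at the posts named by init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each post goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Seed one centroid on a dengue-related post and one on an unrelated post.
labels = kmeans(tf_vectors(POSTS), init_indices=[0, 3])
print(labels)
```

In a full pipeline of the kind described, the resulting clusters could help separate dengue-related posts from unrelated ones before a supervised classifier such as an SVM is trained; a library implementation with TF-IDF weighting and multiple random initialisations would normally replace this hand-written version.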