Real-time opinion mining of Twitter data using Spring XD and Hadoop

Big data has emerged as one of the most active areas of research in the last few years, and the demand for such research is expected to grow further in the coming years. Big data came into prominence largely because of the rapid growth of social media. Twitter is one of the most popular social media platforms on the Internet and receives tens of millions of tweets per day, producing a huge volume of unstructured data that also reflects the sentiment of people on specific topics. A number of researchers have attempted to extract useful information from this raw, unstructured data for various applications. This paper presents a framework to process and visualize raw tweets in a scalable and efficient manner. The main objective of the research work is to capture the sentiment of people and visualize it for better understanding. Spring XD is used to fetch tweets in real time. These raw tweets are then transferred to the Hadoop Distributed File System (HDFS). Hive, the SQL-like query layer on Hadoop, is used to refine the tweets and label them with their respective sentiments. Finally, the sentiments are classified as positive, negative, or neutral using an algorithm run on Hive. The proposed algorithm yields better results in terms of sentiment classification.
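The final labeling step can be illustrated with a minimal sketch. The abstract does not specify the classification algorithm, so the example below assumes a simple lexicon-based scoring rule, written in Python rather than Hive for readability. The word lists and the function names (tokenize, score_tweet, classify_tweet, summarize) are hypothetical and serve only to show how a tweet might be mapped to a positive, negative, or neutral label.

```python
import re

# Hypothetical toy lexicons (assumption); a real system would load a full
# sentiment dictionary rather than these few illustrative words.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "awful"}


def tokenize(text):
    """Lowercase the tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_tweet(text):
    """Return (number of positive words) minus (number of negative words)."""
    tokens = tokenize(text)
    positive_hits = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negative_hits = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positive_hits - negative_hits


def classify_tweet(text):
    """Map the score to one of the three labels named in the abstract."""
    score = score_tweet(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(tweets):
    """Count how many tweets fall under each sentiment label."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for tweet in tweets:
        counts[classify_tweet(tweet)] += 1
    return counts


if __name__ == "__main__":
    sample_tweets = [
        "I love this great phone",
        "What a terrible, awful day",
        "Meeting at 5pm",
    ]
    for tweet in sample_tweets:
        print(classify_tweet(tweet), "|", tweet)
    print(summarize(sample_tweets))
```

The per-label counts returned by summarize are the kind of aggregate that a visualization layer could plot. In a Hive-based pipeline, the same scoring rule would more likely be expressed as a join between tokenized tweets and a dictionary table, followed by a grouped sum per tweet.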