Classification techniques for big data: A survey

Big data is a broad term for working with large-volume, complex data sets. When a data set is large in volume and traditional processing applications are inadequate, distributed databases are needed. Big data emerged because earlier technologies could not handle such large volumes of data from autonomous sources. Finding meaningful and accurate data in large unstructured data is a tedious task for any user, which is why classification techniques came into the picture for big data. With the help of classification methods, unstructured data can be turned into an organized form so that a user can easily access the required data. These classification techniques can be applied to big transactional databases to provide data services to users from large-volume data sets. Classification is an aspect of machine learning, and it falls into two broad categories: supervised and unsupervised classification. In this paper, we study variants of supervised classification methods and compare them on the basis of their advantages and limitations.