Comparative Analysis of Information Extraction Techniques for Data Mining

Background/Objectives: This paper traces the evolution of data processing from the Mesolithic era to recent years, from early manual skills to an advanced taxonomy of processing techniques, and presents a comparative study of the prevailing tools and techniques used mainly for analysing bulky data.

Methods/Statistical Analysis: Researchers have adopted various methods for analysing large amounts of data. The methods differ in their parameters and datasets, depending on the need they address. They are implemented on HDFS and MapReduce in a Hadoop environment, integrated with the R tool. Some methods are enhanced with sentiment analysis through NLP, which improves the performance of density analysis.

Findings: Data and associated facts have existed since the birth of the human species. Data handling began with manual illustration and gradually advanced to the current state-of-the-art storage and processing. Big data involves novel techniques for managing information within a limited run time, and it is highly beneficial to business growth, the administration of society and scientific research. The paper provides an overview of the state of the art and focuses on the use of conventional as well as advanced tools and techniques for effective information extraction.

Applications/Improvements: Handling data at this scale requires upgrading from traditional data filtering techniques and adopting the new big data diagnostic tools.
