Paper information - A study on unstructured text mining algorithm through R programming based on data dictionary

Unlike structured data, which are gathered and stored in a predefined structure, unstructured text data, mostly written in natural language, have recently found wider application with the emergence of Web 2.0. Text mining, which extracts meaningful information from text, has become one of the most important big data analysis techniques, not only because the amount of text data has grown but also because text expresses human emotion directly. In this study, we used R, an open-source software environment for statistical analysis, and studied how to implement algorithms for several analyses (frequency analysis, cluster analysis, word clouds, and social network analysis). To keep the research scope focused, we used a keyword extraction method based on a data dictionary. Applying the approach to real cases, we found R very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.
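The dictionary-based keyword extraction and frequency analysis mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the study used R, whereas this sketch is written in Python, and the dictionary contents, the sample documents, the crude tokenizer, and the function name `extract_keywords` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical data dictionary: the only terms we want to count.
DATA_DICTIONARY = {"mining", "text", "network", "cluster"}


def extract_keywords(documents, dictionary):
    """Count how often each dictionary term appears across the documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase and keep runs of ASCII letters (a deliberately crude tokenizer).
        tokens = re.findall(r"[a-z]+", doc.lower())
        # Keep only tokens that are listed in the data dictionary.
        counts.update(t for t in tokens if t in dictionary)
    return counts


# Hypothetical sample documents.
docs = [
    "Text mining extracts information from text.",
    "Cluster analysis groups similar documents.",
]

freq = extract_keywords(docs, DATA_DICTIONARY)
print(freq.most_common())
```

Restricting the counts to dictionary terms keeps the frequency table focused on the research scope. The same counts could then feed a word cloud or a co-occurrence network, which this sketch does not attempt.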

Jong Hwa Lee | Hyun-Kyu Lee
