Purpose: This study analyzes automobile quality review data to develop an alternative method for analyzing informal data. Existing methods for analyzing informal data rely mainly on the frequency of the data; this research instead tries to use correlation information among informal data items. Method: After sentiment analysis was performed to acquire user information about automobile products, three classification methods (naïve Bayes, random forest, and support vector machine) were employed to classify informal user opinions on automobile quality accurately. Additionally, Word2vec was applied to discover correlated information in the informal data. Result: Of the three classification methods, random forest gave the most effective results. Word2vec succeeded in discovering the data most closely related to automobile components. Conclusion: The proposed method is effective in terms of accuracy and sensitivity for analyzing informal quality data; however, only two sentiments (positive or negative) could be categorized because of human errors. Further studies are required to derive more sentiments so that informal quality data can be classified accurately. Word2vec also showed comparable results in precisely discovering the relevance of components.
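As a rough illustration of the classification and evaluation steps named above, the following minimal sketch trains one of the three classifiers (naïve Bayes) on a handful of invented review sentences and reports accuracy and sensitivity. The example sentences, the whitespace tokenizer, the add-one smoothing, and all function names are assumptions made for illustration; they are not the data or implementation used in the study.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect per-class document counts and word counts (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()  # assumption: simple whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, doc):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in doc.lower().split():
            if token in vocab:  # words unseen in training are ignored
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_and_sensitivity(y_true, y_pred, positive="positive"):
    """Accuracy = correct / total; sensitivity = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Invented toy reviews, for illustration only.
train_docs = [
    "engine runs smooth and quiet",
    "brakes feel strong and reliable",
    "engine noise is terrible",
    "brakes failed and steering is loose",
]
train_labels = ["positive", "positive", "negative", "negative"]
test_docs = ["smooth quiet engine", "terrible steering noise"]
test_labels = ["positive", "negative"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
accuracy, sensitivity = accuracy_and_sensitivity(test_labels, predictions)
print(predictions, accuracy, sensitivity)
```

The same evaluation function could be reused to compare other classifiers, such as random forest or a support vector machine, on identical held-out reviews.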