ONSET: Opinion and Aspect Extraction System from Unlabelled Data
Online businesses are highly interested in practical solutions to opinion mining, but extracting aspects and sentiments from text is challenging. One way to obtain good-quality extractions from reviews is to fine-tune state-of-the-art pre-trained language models. However, such fine-tuned language models produce good results only if they are trained with a large amount of relevant data. This paper presents a technique that can fine-tune language models for opinion extraction using unlabelled training data, and proposes a novel opinion mining system called ONSET. The system fine-tunes a language model by first labelling aspects with an unsupervised approach based on topic modeling and then applying semi-supervised learning with data augmentation. Extensive experiments show that the proposed model achieves results similar to those that some state-of-the-art models obtain with a large quantity of labelled training data. F1-scores of 87.30% and 88.35% are achieved on the SemEval Aspect-Based Sentiment Analysis and Twitter datasets, respectively.
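The data-preparation idea described above (weak aspect labels derived from topics, followed by data augmentation) can be illustrated with a minimal sketch. It is not the ONSET implementation: the topic keywords, the synonym table, and all function names are hypothetical, the keyword lists merely stand in for the top words a topic model might produce, and the language-model fine-tuning and semi-supervised steps are not shown.

```python
from collections import Counter

# Hypothetical topic -> keyword lists, standing in for the top words a topic
# model (e.g. LDA) might assign to each topic. Not taken from the paper.
TOPIC_KEYWORDS = {
    "food": ["pizza", "pasta", "taste", "menu"],
    "service": ["waiter", "staff", "service", "rude"],
}

# Hypothetical synonym table used for a very simple augmentation step.
SYNONYMS = {"waiter": "server", "taste": "flavour", "staff": "team"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [tok.strip(".,!?").lower() for tok in text.split()]


def weak_aspect_label(text):
    """Return the aspect whose keywords occur most often, or None if none match."""
    tokens = tokenize(text)
    counts = Counter()
    for aspect, keywords in TOPIC_KEYWORDS.items():
        counts[aspect] = sum(tokens.count(k) for k in keywords)
    aspect, hits = counts.most_common(1)[0]
    return aspect if hits > 0 else None


def augment(text):
    """Return a normalized copy of the text with known words swapped for synonyms."""
    return " ".join(SYNONYMS.get(tok, tok) for tok in tokenize(text))


def build_training_set(reviews):
    """Weakly label each review, then add an augmented copy with the same label."""
    dataset = []
    for review in reviews:
        label = weak_aspect_label(review)
        if label is None:
            continue  # reviews with no keyword match stay unlabelled
        dataset.append((review, label))
        augmented = augment(review)
        if augmented != " ".join(tokenize(review)):
            dataset.append((augmented, label))  # keep only real variations
    return dataset


if __name__ == "__main__":
    reviews = ["The waiter was rude.", "Great pizza and pasta!", "Nice weather today."]
    for text, label in build_training_set(reviews):
        print(label, "|", text)
```

In a pipeline of this shape, the resulting (text, label) pairs would serve as training data for fine-tuning, which is why no hand-labelled examples appear anywhere in the sketch.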