Topic Modelling and Sentiment Analysis with the Bangla Language: A Deep Learning Approach Combined with the Latent Dirichlet Allocation
This thesis investigates topic modelling and sentiment analysis for the Bangla
language and makes two related contributions: we propose models for both the
topic modelling task and the sentiment analysis task. Much research exists on
both tasks, but it does not address the Bangla language. Topic modelling is a
powerful technique for unsupervised
analysis of large document collections. Various efficient topic modelling
techniques are available for English, as it is one of the most widely spoken
languages in the world, but not for other languages. Bangla is the seventh most
spoken native language in the world by population, so it needs automation in
several areas. This thesis finds the core topics of a Bangla news corpus and
classifies news articles with a similarity measure, which is the first of our
contributions. To the best of our knowledge, this is the first tool for Bangla
topic modelling. The document models are built using LDA (Latent Dirichlet
Allocation) with bigrams.
In recent years, people in Bangladesh have become heavily involved in social
media, writing in Bangla. As part of this activity, people post their opinions
about products and businesses on different social sites, of which Facebook is
the most prominent. We collected Bangla comments from Facebook and applied a
state-of-the-art algorithm to extract their sentiment, which is the second
contribution. The proposed system demonstrates efficient sentiment analysis.
We have performed a comparative analysis against the existing sentiment analysis
system for Bangla. However, extracting sentiment from Bangla is not
straightforward because of its complex grammatical structure. A deep-learning-based
method was applied to train the model and capture the underlying sentiment.
The main idea is to compare word-level and character-level encoding of the input
and to observe the difference in model performance. Overall, we explore
different algorithms and techniques for topic modelling and sentiment analysis
of the Bangla language.
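
As a rough illustration of the word-level versus character-level encoding mentioned above, the following minimal Python sketch turns a Bangla comment into fixed-length integer sequences in both ways. It is not the thesis's implementation: the example comments, the helper names (build_vocab, word_units, char_units, encode), the reserved ids and the maximum lengths are all assumptions made for illustration. Characters are taken to be Unicode code points, which is a simplification for Bangla, where one visible letter can span several code points.

```python
from typing import Dict, List

# Reserved ids (an assumption of this sketch): 0 pads short inputs,
# 1 stands in for units that were never seen when building the vocabulary.
PAD, UNK = 0, 1


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id (starting at 2) to each unit in order of first appearance."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for unit in seq:
            if unit not in vocab:
                vocab[unit] = len(vocab) + 2
    return vocab


def word_units(text: str) -> List[str]:
    """Word-level view: split on whitespace."""
    return text.split()


def char_units(text: str) -> List[str]:
    """Character-level view: one unit per Unicode code point, spaces included."""
    return list(text)


def encode(units: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map units to ids, truncate to max_len, then right-pad with PAD."""
    ids = [vocab.get(u, UNK) for u in units[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Two invented example comments (hypothetical, not from the thesis data).
comments = ["খুব ভালো পণ্য", "খুব খারাপ সেবা"]

word_vocab = build_vocab([word_units(c) for c in comments])
char_vocab = build_vocab([char_units(c) for c in comments])

word_ids = encode(word_units(comments[0]), word_vocab, max_len=5)
char_ids = encode(char_units(comments[0]), char_vocab, max_len=20)

print(word_ids)
print(char_ids)
```

The word-level sequence is short but depends on a vocabulary that grows with every new word form, while the character-level sequence is longer but draws on a small, closed set of symbols. Either sequence could then be fed to an embedding layer of a neural classifier.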