Sentence-Level and Document-Level Sentiment Mining for Arabic Texts

In this work, we investigate sentiment mining of Arabic text at both the sentence level and the document level. Existing research in Arabic sentiment mining remains very limited. For sentence-level classification, we investigate two approaches. The first is a novel grammatical approach that employs the use of a general structure for the Arabic sentence. The second approach is based on the semantic orientation of words and their corresponding frequencies, to do this we built an interactive learning semantic dictionary which stores the polarities of the roots of different words and identifies new polarities based on these roots. For document-level classification, we use sentences of known classes to classify whole documents, using a novel approach whereby documents are divided dynamically into chunks and classification is based on the semantic contributions of different chunks in the document. This dynamic chunking approach can also be investigated for sentiment mining in other languages. Finally, we propose a hierarchical classification scheme that uses the results of the sentence-level classifier as input to the document-level classifier, an approach which has not been investigated previously for Arabic documents. We also pinpoint the various challenges that are faced by sentiment mining for Arabic texts and propose suggestions for its development. We demonstrate promising results with our sentence-level approach, and our document-level experiments show, with high accuracy, that it is feasible to extract the sentiment of an Arabic document based on the classes of its sentences.