Paper Information - Assessing Short Answers in Indonesian Using Semantic Text Similarity Method and Dynamic Corpus


Automatic short-answer assessment is a form of computer-assisted testing that grades answers written in natural language. Several methods have been used to build systems whose scores approximate human marking. For Indonesian, string-based similarity methods that match keywords are straightforward to apply, as previous studies have done. However, short answers are characterized by their content, question type, and answer length, which string-based methods alone cannot accommodate. This study implements a hybrid method that combines corpus-based and string-based similarity. The Semantic Text Similarity (STS) method is used to assess short answers in Indonesian. STS combines three similarity measures: Normalized and Modified Longest Common Subsequence, Second Order Co-occurrence Pointwise Mutual Information, and Common Word Order Similarity. We also use a dynamic corpus, which has the advantage of being relatively small and adaptable to the learning domain. The dynamic corpus is generated with the Gensim module and consists of the top five student answers that Gensim returns. The STS method is compared with Cosine Similarity, the method most commonly used to assess answers in Indonesian. The results show that STS outperforms Cosine Similarity in terms of Mean Absolute Error but does not outperform it in terms of correlation.
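To make the quantities named in the abstract concrete, here is a minimal sketch of one string-based component (a normalized longest common subsequence), the Cosine Similarity baseline, and the Mean Absolute Error metric. It is an illustration under stated assumptions, not the authors' implementation: the normalization (squared LCS length divided by the product of the two string lengths), the character-level comparison, the whitespace tokenization, and all function names are assumptions made for this example. The corpus-based SOC-PMI component and the word-order component are not shown.

```python
import math
from collections import Counter


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def nlcs(a: str, b: str) -> float:
    """Normalized LCS (assumed form): LCS length squared over the product of lengths."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) ** 2 / (len(a) * len(b))


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors.

    Assumes lowercase, whitespace-split tokens; no stemming or stopword removal.
    """
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_absolute_error(predicted, actual) -> float:
    """Average absolute difference between system scores and human scores."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

In this sketch, a system score for a student answer would be a similarity value against a reference answer, scaled to the grading range, and `mean_absolute_error` would then compare those scores with human marks. A lower error means the scores are closer to the human marks on average, which is a separate question from how well the two sets of scores correlate.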

Uswatun Hasanah | Bambang Pilu Hartato
