Implementation of Deep Learning for Entity Matching on Drug Datasets (Case Study of K24 and Farmaku)

Fast data processing matters to companies because it shortens the time needed for analysis. Entity matching is one of the computational tasks involved in that processing: it determines whether two different records refer to the same real-world entity. The task becomes difficult when the datasets being compared are large, and deep learning is one way to address it. DeepMatcher is a Python package, built on a deep learning model architecture, that is designed for entity matching. This study applies DeepMatcher to match records between two drug datasets taken from farmaku.com and k24klik.com. The matching model used is the Hybrid model. According to the test results, the Hybrid model matched entities across the two datasets, and the authors conclude that the entity matching in this study runs well. The best result was obtained in the 10th training run, with an F1 score of 30.30, a precision of 17.86, and a recall of 100.
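The three reported metrics are linked by the standard definition of F1 as the harmonic mean of precision and recall. The following is a minimal sketch of that arithmetic, not code from the study; the helper name `f1_score` is a hypothetical choice for illustration, and the inputs are the rounded percentages quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Rounded percentages as reported in the abstract (assumed to be in percent).
precision = 17.86
recall = 100.0

f1 = f1_score(precision, recall)
print(f"F1 computed from the reported precision and recall: {f1:.2f}")
```

Recomputing from the rounded inputs agrees with the reported F1 of 30.30 to within rounding. By definition, a recall of 100 with a precision of 17.86 means every true match was found, but most of the pairs predicted as matches were not true matches.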
