Implementation of Cosine Similarity and the Smith-Waterman Algorithm to Detect Text Similarity

The originality of academic writing is increasingly called into question as file-archiving technology, especially on the internet, makes other people's writing ever easier to access. A system for detecting text similarity is therefore needed. This research addresses the problem by developing a text-mining application that implements cosine similarity and the Smith-Waterman algorithm to detect text similarity. Cosine similarity measures text similarity based on word occurrence, while the Smith-Waterman algorithm measures text similarity based on word sequence. In testing, the developed application successfully detected text similarity across pairs of texts ranging from very similar to very dissimilar.
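
The two measures can be illustrated with a minimal Python sketch. It is not the application's actual implementation: the function names, the whitespace tokenizer, the scoring parameters (match = 2, mismatch = -1, gap = -1), and the normalization of the Smith-Waterman score to the range 0 to 1 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    # A real system would likely also strip punctuation and stop words.
    return text.lower().split()


def cosine_similarity(tokens_a, tokens_b):
    # Compare the texts as word-frequency vectors; word order is ignored.
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def smith_waterman_score(tokens_a, tokens_b, match=2, mismatch=-1, gap=-1):
    # Word-level local alignment with a linear gap penalty.
    # Returns the highest cell value in the scoring matrix.
    cols = len(tokens_b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(tokens_a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            sub = match if tokens_a[i - 1] == tokens_b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def smith_waterman_similarity(tokens_a, tokens_b, match=2, mismatch=-1, gap=-1):
    # Assumed normalization: divide by the best score the shorter text
    # could reach if every one of its words matched in order.
    shorter = min(len(tokens_a), len(tokens_b))
    if shorter == 0:
        return 0.0
    score = smith_waterman_score(tokens_a, tokens_b, match, mismatch, gap)
    return score / (match * shorter)


if __name__ == "__main__":
    a = tokenize("the cat sat on the mat")
    b = tokenize("the mat sat on the cat")
    print("cosine:", cosine_similarity(a, b))
    print("smith-waterman:", smith_waterman_similarity(a, b))
```

The two example sentences contain exactly the same words in a different order, so the occurrence-based measure treats them as identical while the sequence-based measure does not. This is the kind of complementary behavior that motivates combining the two.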