Business Trends Based on News Portal Websites for Analysis of Big Data Using K-Means Clustering

Business analysis is performed to determine which businesses are popular in Indonesia, using text mining to collect data from several Indonesian news portals. Text preprocessing converts the title and tags of each news item into weights. The weighted data are then grouped into clusters with the K-Means algorithm, and each cluster is visualized as a Word Cloud so that frequently occurring words can be identified as popular terms. Testing uses the Silhouette Coefficient to measure how well each member fits its cluster, and each member is then interpreted according to the test result. The analysis is carried out for every month of 2018 on a total of 995 data items, with an average of 6 clusters per month. In January, 64 data items formed 6 clusters. The most popular business was determined by the number of cluster members, and the cluster with the most members was cluster 1. Its Silhouette Coefficient test results were strong 0.00%, medium 65.22%, weak 30.43%, and not substantial 4.35%, and the Word Cloud formed for it indicated a leather bag business.
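The pipeline described above (weighting, K-Means grouping, Silhouette testing) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this study. It assumes TF-IDF as the weighting scheme, Euclidean distance, and centroids seeded from the first k items. The four sample titles are invented for illustration, and the category thresholds (0.70, 0.50, 0.25) follow a commonly cited interpretation of the Silhouette Coefficient that the abstract itself does not state.

```python
import math
from collections import Counter

def tfidf(docs):
    """Convert tokenized titles into TF-IDF weight vectors (assumed weighting)."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * math.log(n / df[w]) for w in vocab])
    return vectors

def dist(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd iteration; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Silhouette Coefficient s = (b - a) / max(a, b) for every member."""
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        others = [c for c in set(labels) if c != labels[i]]
        if not same or not others:
            scores.append(0.0)  # singleton cluster or single-cluster case
            continue
        a = sum(same) / len(same)  # mean distance within the member's own cluster
        b = min(sum(dist(p, q) for q, l in zip(points, labels) if l == c)
                / labels.count(c) for c in others)  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

def category(s):
    """Assumed thresholds; the source does not state them."""
    if s > 0.70:
        return "strong"
    if s > 0.50:
        return "medium"
    if s > 0.25:
        return "weak"
    return "not substantial"

# Hypothetical, already preprocessed news titles (not from the study's data).
titles = [
    ["leather", "bag", "business"],
    ["coffee", "shop", "franchise"],
    ["leather", "bag", "craft"],
    ["coffee", "shop", "startup"],
]
weights = tfidf(titles)
labels = kmeans(weights, k=2)
scores = silhouette(weights, labels)

# Percentage of members per Silhouette category, as reported per cluster.
counts = Counter(category(s) for s in scores)
shares = {name: 100.0 * n / len(scores) for name, n in counts.items()}
print(labels, shares)
```

With real data, the same percentages would be computed per cluster and per month, and the most frequent words of each cluster would feed the Word Cloud.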