Cluster Scoring Method for Keyword Search with Relevancy-Connected Cluster Model Algorithm

The Relevancy-Connected Cluster Model algorithm searches for keywords over graphs in four steps. In its last step, the resulting clusters are sorted using a kernel function. A top-ranked cluster should cover a more diverse set of input keywords and have a small total number of edges. However, there is no specific rule about which of these criteria should take priority: a cluster that contains more of the input keywords is likely to have more edges, and vice versa. To address this issue, we propose a cluster scoring method that combines the total number of keyword variations found with the total number of edges traced by each core node within a cluster. We also examine the r value, since it is used in sorting the clusters and has a significant influence throughout the algorithm.
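
The trade-off between keyword coverage and edge count can be illustrated with a minimal sketch. This is not the scoring formula proposed in this work, which the abstract does not state; the `Cluster` fields, the `weight` parameter, and the form of `score` are all assumptions chosen only to show how a single combined score can settle the ordering when the two criteria disagree.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    keywords_found: int  # distinct input keywords matched in the cluster
    total_edges: int     # edges traced from the core node(s) in the cluster


def score(cluster, num_query_keywords, weight=0.5):
    """Hypothetical combined score (an assumption, not the paper's formula).

    Rewards the fraction of query keywords covered and penalizes the
    number of edges; `weight` sets the balance between the two terms.
    """
    coverage = cluster.keywords_found / num_query_keywords
    compactness = 1.0 / (1.0 + cluster.total_edges)
    return weight * coverage + (1.0 - weight) * compactness


def rank(clusters, num_query_keywords, weight=0.5):
    """Sort clusters from highest to lowest combined score."""
    return sorted(
        clusters,
        key=lambda c: score(c, num_query_keywords, weight),
        reverse=True,
    )


if __name__ == "__main__":
    wide = Cluster("wide", keywords_found=3, total_edges=6)
    tight = Cluster("tight", keywords_found=2, total_edges=1)
    for w in (0.5, 0.9):
        order = [c.name for c in rank([wide, tight], 3, weight=w)]
        print(w, order)
```

In this sketch, a cluster covering every keyword through many edges and a smaller cluster covering fewer keywords swap places as `weight` changes, which is the ambiguity a fixed scoring rule is meant to remove.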