Discovering Semantic Proximity for Web Pages

Dynamic Nearness is a data mining algorithm that detects semantic relationships between objects in a database based on their access patterns. Applied to web pages, it allows automatic, dynamic reconfiguration of a web site. Worst-case storage requirements for the algorithm are quadratic in the number of web pages, but practical reductions, such as ignoring a few long transactions that provide little information, reduce storage requirements to linear. Thus, Dynamic Nearness scales to large systems. The methodology is validated via experiments run on an existing, moderately sized web site.
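
To make the storage argument concrete, the following minimal sketch counts how often two pages are accessed within the same transaction and skips transactions above a length cutoff. It is an illustration only, not the Dynamic Nearness algorithm itself, whose details are not given here. The function names `co_access_counts` and `nearest`, the `max_len` cutoff, and the use of raw co-occurrence counts as the nearness score are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(transactions, max_len=50):
    """Count co-accesses for each unordered pair of pages.

    `transactions` is an iterable of page-identifier sequences (for
    example, one per user session). `max_len` is a hypothetical cutoff:
    a transaction with n distinct pages contributes n*(n-1)/2 pairs, so
    skipping the few very long ones bounds the pairs added per transaction.
    """
    counts = Counter()
    for pages in transactions:
        unique = sorted(set(pages))  # ignore repeats; fix a canonical pair order
        if len(unique) > max_len:
            continue  # long transaction: many pairs, little information
        counts.update(combinations(unique, 2))
    return counts


def nearest(counts, page, k=3):
    """Return up to k pages most often co-accessed with `page`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == page:
            scores[b] += n
        elif b == page:
            scores[a] += n
    return [p for p, _ in scores.most_common(k)]


if __name__ == "__main__":
    # Hypothetical access log: three short sessions and one long one.
    log = [
        ["/home", "/docs"],
        ["/home", "/docs", "/faq"],
        ["/home", "/docs"],
        ["/a", "/b", "/c", "/d", "/e"],  # skipped when max_len=3
    ]
    counts = co_access_counts(log, max_len=3)
    print(nearest(counts, "/home", k=2))
```

With the cutoff in place, each retained transaction adds at most `max_len * (max_len - 1) / 2` pairs, which is the intuition behind dropping long transactions; whether total storage is actually linear in the number of pages depends on the access patterns of the site.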