Measurement of Distance from Page Sequences Using Dynamic Programming
The Internet plays a vital role in access to information, and the amount of data available on it is growing rapidly. Web data, however, includes much irrelevant information and comes in many different formats. This heterogeneity makes retrieving relevant information from web data a challenging task. Web usage mining extracts relevant information from the large volume of data recorded in web logs, which hold intrinsic information about the web pages accessed. Because web log data is so large, it is better to process a small set of data at a time than to handle the whole data set at once. We therefore need to measure the distance between two user sessions, which can be done with a distance or similarity function. Clustering then establishes groups of users who exhibit similar browsing patterns. In this paper we propose a novel algorithm for measuring the similarity between two user sessions, based on sequence alignment using the Longest Common Subsequence method.
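To make the idea concrete, the following is a minimal sketch of how a Longest Common Subsequence (LCS) length can be computed by dynamic programming and turned into a distance between two sessions. It is an illustration only, not the algorithm proposed in this paper: the function names `lcs_length` and `session_distance`, the representation of a session as a list of page identifiers, and the normalization by the longer session length are all assumptions made for the example.

```python
from typing import Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programming recurrence, keeping only the previous
    row of the table, so memory use is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching pages extend the common subsequence by one.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def session_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Distance in [0, 1] between two sessions.

    Assumption for this sketch: the LCS length is normalized by the
    length of the longer session. Two empty sessions have distance 0.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


# Hypothetical sessions: ordered lists of page identifiers.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["home", "search", "products", "checkout"]
print(lcs_length(s1, s2), session_distance(s1, s2))
```

In this hypothetical example the two sessions share the ordered pages home, products and checkout, so the LCS length is 3 and the distance is 0.25. Other normalizations, such as dividing by the sum of the two session lengths, are equally plausible and would give different distance values.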