Optimal Algorithms for Generation of User Session Sequences Using Server Side Web User Logs

Identifying user session boundaries is one of the most important steps in web usage mining for predictive prefetching of a user's next request based on navigation behavior. This paper presents new techniques that identify user session boundaries by considering IP address, browsing agent, intersession and intrasession timeouts, immediate link analysis between referred pages, and backward reference analysis, without searching the whole tree representing the server pages. A complete set of user session sequences is generated, along with a learning graph built from these sequences, and this graph is used for predictive prefetching. We compare the performance of the approach with the existing reference length method and maximal reference method. Our analysis of logs from different servers shows that the approach gives better results in terms of time complexity and precision, both in identifying user session boundaries and in generating all relevant user session sequences.
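
To make the boundary criteria concrete, the following is a minimal Python sketch of session splitting on IP address, browsing agent, timeouts, and an immediate referrer link check. It is an illustration under stated assumptions, not the algorithm evaluated in this paper: the record type Hit, the function split_sessions, the parameter names max_gap and max_duration, and their default values of 600 and 1800 seconds are all hypothetical. The sketch also assumes that one timeout bounds the gap between consecutive requests and the other bounds total session length, and that a request with an empty referrer stays in the current session. Backward reference analysis and the learning graph are not covered.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One parsed log record (hypothetical field layout)."""
    ip: str
    agent: str
    time: float      # request time in seconds
    url: str
    referrer: str    # "" when the log carries no referrer


def split_sessions(hits, max_gap=600.0, max_duration=1800.0):
    """Split hits into per-user sessions, each returned as a list of URLs.

    max_gap and max_duration are assumed thresholds, in seconds.
    """
    # Treat each distinct (IP address, browsing agent) pair as one user.
    by_user = defaultdict(list)
    for hit in sorted(hits, key=lambda h: h.time):
        by_user[(hit.ip, hit.agent)].append(hit)

    sessions = []
    for user_hits in by_user.values():
        current = []
        for hit in user_hits:
            if current:
                gap = hit.time - current[-1].time
                duration = hit.time - current[0].time
                visited = {h.url for h in current}
                # Link check: the referrer must be a page already seen
                # in this session (an empty referrer is accepted here).
                linked = hit.referrer == "" or hit.referrer in visited
                if gap > max_gap or duration > max_duration or not linked:
                    sessions.append([h.url for h in current])
                    current = []
            current.append(hit)
        if current:
            sessions.append([h.url for h in current])
    return sessions
```

Because the link check only inspects pages already collected in the current session, this sketch never consults the site's page tree; whether that matches the immediate link analysis used in the paper is an assumption.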