A New Algorithm for Identifying Transactions Based on Alternately-recorded Log File

This paper proposes a new algorithm for identifying transactions in web usage mining, which uses the methods of finding maximal forward references and large reference sequences. Based on a thorough analysis of experimental results, raw log files, users' browsing behavior, and the limitations of the Internet itself, the author confirms the existence of alternately recorded log files. Existing algorithms are clearly unable to handle the new characteristics of such raw data. The new algorithm is based on depth-first search (DFS) of a directed graph. A simulation experiment tests how effective the algorithm is on this new kind of log file.
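For orientation, the maximal forward reference method named above can be sketched as follows. This is a minimal illustration of that existing technique as it is commonly described, assuming a single user's clicks arrive in order and are not interleaved with other sessions. It is not the DFS-based algorithm proposed in this paper. The function name `maximal_forward_references` and the single-letter page labels are hypothetical, chosen only for the example.

```python
from typing import List


def maximal_forward_references(clicks: List[str]) -> List[List[str]]:
    """Split one user's click sequence into maximal forward references.

    Assumption: `clicks` is an ordered list of page identifiers for a
    single user session (hypothetical input format).
    """
    path: List[str] = []          # current forward path from the start page
    result: List[List[str]] = []  # collected maximal forward references
    moving_forward = False        # True if the previous click extended the path

    for page in clicks:
        if page in path:
            # Backward reference: the user returned to a page on the path.
            # The path is maximal only if the previous click moved forward.
            if moving_forward:
                result.append(list(path))
            # Truncate the path back to the revisited page.
            del path[path.index(page) + 1:]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            path.append(page)
            moving_forward = True

    # A path still being extended at the end is also maximal.
    if moving_forward and path:
        result.append(list(path))
    return result


if __name__ == "__main__":
    session = list("ABCDCBEGHGWAOUOV")
    for ref in maximal_forward_references(session):
        print("".join(ref))
    # prints ABCD, ABEGH, ABEGW, AOU, AOV (one per line)
```

The sketch relies on each backward move being visible as a revisit within one clean click sequence, which is the assumption that alternately recorded log files would call into question.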