Hot item mining and summarization from multiple auction Web sites

Online auction Web sites are fast changing, highly dynamic, and complex as they involve tremendous sellers and potential buyers, as well as a huge amount of items listed for bidding. We develop a two-phase framework which aims at mining and summarizing hot items from multiple auction Web sites to assist decision making. The objective of the first phase is to automatically extract the product features and product feature values of the items from the descriptions provided by the sellers. We design a HMM-based learning method to train an extended HMM model which can adapt to the unseen Web page from which the information is extracted. The goal of the second phase is to discover and summarize the hot items based on the extracted information. We formulate the hot item mining task as a semi-supervised learning problem and employ the graph mincuts algorithm to accomplish this task. The summary of the hot items is then generated by considering the frequency and the position of the product features being mentioned in the descriptions. We have conducted extensive experiments from several real-world auction Web sites to demonstrate the effectiveness of our framework.