Improved Reinforcement-based Profile Learning for Document Filtering

Summary A personalized information filtering system tailors the information it delivers to the user's current interests and adapts as those interests change over time. The system monitors a stream of incoming documents, learns the user's information needs in the form of profiles, and delivers only those documents that match the profiles. To learn a profile, the significance of the query terms is assessed and a weight is assigned to each term in the profile. This article proposes a purity term weighting method for profile learning in a personalized information filtering system. The main idea is to weight terms based on their pure frequencies, in addition to the number of pure relevant documents that contain them. Profiles are discriminated by the top-weighted terms that represent them, and each profile is updated with every selected relevant document so that it continues to match the user's interests. The effectiveness of the proposed method is measured using the linear utility measure on the TREC 2002 filtering track. The experimental results show improved term selection and profile building accuracy compared with Rocchio's algorithm, Okapi/BSS (Basic Search System), and the incremental profile learning approach.
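
As an illustration of the evaluation measure mentioned above, the following is a minimal sketch of a linear utility computation. It assumes the form commonly used in the TREC 2002 filtering track, in which each relevant document delivered earns a credit of 2 and each non-relevant document delivered costs 1. The function name, the parameter names, and the sample data are hypothetical and are not taken from the article.

```python
from typing import Iterable, Tuple


def linear_utility(
    decisions: Iterable[Tuple[bool, bool]],
    credit: float = 2.0,   # assumed credit per relevant document delivered
    penalty: float = 1.0,  # assumed cost per non-relevant document delivered
) -> float:
    """Compute credit * R+ - penalty * N+ over filtering decisions.

    Each decision is a (delivered, relevant) pair. Documents that were
    not delivered do not contribute to this measure.
    """
    relevant_delivered = 0
    nonrelevant_delivered = 0
    for delivered, relevant in decisions:
        if not delivered:
            continue
        if relevant:
            relevant_delivered += 1
        else:
            nonrelevant_delivered += 1
    return credit * relevant_delivered - penalty * nonrelevant_delivered


# Hypothetical stream: two relevant documents delivered, one non-relevant
# document delivered, and one relevant document missed.
example = [(True, True), (True, False), (False, True), (True, True)]
print(linear_utility(example))
```

Under this measure a filter is rewarded for delivering relevant documents and penalized for delivering non-relevant ones, so a profile whose top-weighted terms discriminate well should score higher on the same stream.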