Finding valuable content in the Web 2.0 environment has become more difficult because many inexperienced users contribute large amounts of unorganized content. Previous research has shown that non-textual information, such as the number of references, the number of supports, and the length of an answer, is effective for evaluating answers to a question on an online QnA service site. However, these features can easily be changed by users and do not reflect the social activity of users. In this paper, we propose a new method for evaluating user reputation using co-occurrence features between questions and answers together with collective intelligence. Once user reputation can be calculated, we can estimate the worth of content that has few references and few supports. We compute user reputation using a modified PageRank algorithm. The experimental results show that the proposed method is effective and useful for identifying such content.
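The abstract does not say how PageRank is modified, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It runs a standard weighted PageRank over a hypothetical user graph in which each edge points from an asker to an answerer. The function name `user_reputation`, the edge format, and the use of an edge weight (which could, for instance, hold a question-answer co-occurrence score) are all assumptions made for illustration.

```python
from collections import defaultdict


def user_reputation(edges, damping=0.85, iterations=50):
    """Weighted PageRank over a user graph (illustrative sketch only).

    edges: iterable of (asker, answerer, weight) tuples. The weight is a
    hypothetical stand-in for any non-negative score attached to an answer.
    Returns a dict mapping each user to a reputation score; scores sum to 1.
    """
    out_weight = defaultdict(float)   # total outgoing weight per user
    incoming = defaultdict(list)      # answerer -> [(asker, weight), ...]
    users = set()

    for asker, answerer, weight in edges:
        users.update((asker, answerer))
        if weight <= 0:
            continue  # ignore edges that carry no endorsement
        out_weight[asker] += weight
        incoming[answerer].append((asker, weight))

    n = len(users)
    if n == 0:
        return {}

    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users with no outgoing edges spread their rank uniformly.
        dangling = sum(rank[u] for u in users if out_weight[u] == 0.0)
        new_rank = {}
        for u in users:
            flow = sum(rank[v] * w / out_weight[v] for v, w in incoming[u])
            new_rank[u] = (1.0 - damping) / n + damping * (flow + dangling / n)
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical interactions: (asker, answerer, weight)
    sample = [("alice", "bob", 1.0), ("carol", "bob", 2.0), ("bob", "carol", 1.0)]
    for user, score in sorted(user_reputation(sample).items()):
        print(user, round(score, 3))
```

In this sketch an answer from a high-reputation user can receive a high estimate even when it has few references or supports, which is the use case the abstract describes.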