This paper presents techniques for retrieving useful information from a mixture of Web pages collected from either question-answer sites (Q&A sites) or Web search engines. The proposed techniques are designed to discover as much know-how knowledge as possible from such collections of Web pages, where know-how knowledge is defined as text content that qualifies as an information source for a specific domain of questions. The main goal is to build a framework that selects helpful information for answering various problems of interest, such as useful tips in response to a question. The techniques in this paper primarily attempt to complement the knowledge available on Q&A sites with pages collected from search engines, using topic models. To support the claim that pages collected from search engines truly supplement the know-how knowledge on Q&A sites, we verify how much additional useful information the Web search engine provides by manually inspecting the Web pages aggregated by the topic model.