Extracting Structured Knowledge for Semantic Web by Mining Wikipedia

Wikipedia has become a large-scale database covering a wide range of human knowledge, which makes it a promising corpus for knowledge extraction. A considerable body of research on Wikipedia mining has confirmed its value as a corpus. Wikipedia's notable characteristics are not limited to its scale; they also include a dense link structure, URIs that support word sense disambiguation, well-structured Infoboxes, and a category tree. One popular approach in Wikipedia mining is to use the category tree as an ontology, and a number of researchers have reported significant results showing that Wikipedia's categories are a promising resource for ontology construction. In this work, we aim to demonstrate the capability of Wikipedia as a corpus for knowledge extraction and to show how the extracted knowledge works in the Semantic Web environment. We present two results: Wikipedia Thesaurus, a large-scale association thesaurus built by mining Wikipedia's link structure, and Wikipedia Ontology, a Web ontology extracted by mining Wikipedia articles.