A Persian Web Page Classifier Applying a Combination of Content-Based and Context-Based Features

Many automatic classification methods and algorithms have been proposed that rely on either content-based or context-based features of web pages. In this paper we analyze these features and exploit a combination of them to improve the accuracy of Persian web page classification. We propose a linear combination of the different features, with the weights adjusted toward their optimum values during application. To evaluate this approach, we conducted various experiments on a dataset consisting of all pages of Persian Wikipedia in the field of computing. These experiments demonstrate the usefulness of combining content-based and context-based web page features in a linear weighted combination.
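
As an illustration of the idea of a linear weighted combination, the following minimal Python sketch combines per-feature category scores and picks weights by a simple grid search. It is not the implementation used in this paper: the feature names (`body`, `anchor`), the function names, the score values, and the grid-search tuning procedure are all assumptions made for this example.

```python
from itertools import product


def combine(scores_by_feature, weights):
    """Return the weighted sum of category scores over all features.

    scores_by_feature: {feature_name: {category: score}} (hypothetical layout)
    weights:           {feature_name: weight}
    """
    categories = set()
    for scores in scores_by_feature.values():
        categories.update(scores)
    return {
        c: sum(weights[f] * scores_by_feature[f].get(c, 0.0)
               for f in scores_by_feature)
        for c in categories
    }


def classify(scores_by_feature, weights):
    """Pick the category with the highest combined score."""
    combined = combine(scores_by_feature, weights)
    # Sorting the categories makes tie-breaking deterministic.
    return max(sorted(combined), key=combined.get)


def tune_weights(samples, feature_names, steps=5):
    """Grid search over non-negative weights that sum to 1 (an assumed
    tuning procedure), returning the first weight set with the best accuracy.

    samples: list of (scores_by_feature, true_category) pairs
    """
    best_weights, best_accuracy = None, -1.0
    for combo in product(range(steps + 1), repeat=len(feature_names)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = {f: k / steps for f, k in zip(feature_names, combo)}
        correct = sum(classify(s, weights) == label for s, label in samples)
        accuracy = correct / len(samples)
        if accuracy > best_accuracy:
            best_weights, best_accuracy = weights, accuracy
    return best_weights, best_accuracy


# Toy data: one content-based feature (body) and one context-based (anchor).
page = {
    "body": {"hardware": 0.6, "software": 0.4},
    "anchor": {"hardware": 0.2, "software": 0.8},
}
other = {
    "body": {"hardware": 0.5, "software": 0.5},
    "anchor": {"hardware": 0.9, "software": 0.1},
}
samples = [(page, "software"), (other, "hardware")]
weights, accuracy = tune_weights(samples, ["body", "anchor"])
print(weights, accuracy)
```

In this sketch the weights are constrained to be non-negative and to sum to one, so each weight can be read as the relative influence of its feature on the final decision.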