Web mining: extraction of information and knowledge discovery from enterprise websites

Making practical, effective use of the enormous quantity of data available on the web is the focus of many researchers. This article lays out a framework for discovering potentially profitable collaborative networks among firms from information available on the internet. The knowledge uncovered in this way is the primary reason why companies attempt to co-operate. To support this knowledge discovery, it is essential to identify the fields of activity and the skills, or "savoir faire", of each business. We present a Web Mining approach built on an application that gathers and processes textual corpora drawn from the companies' own websites. The aim of the work is to automatically detect the NAF code (Nomenclature d'Activités Française, the French classification of business activities) of an enterprise by exploring only its website. Similarity measures can then be compared. Our developments are based on an original method. Evaluation tests have been carried out, and the results are very encouraging.