A Study of Web Text Categorization and Its Algorithm Based on Machine Learning

We propose a machine-learning-based solution for web text categorization in information retrieval. Text is crawled under a level constraint, and features are extracted by a selection method that combines document frequency and term frequency. The selected features are then weighted to improve categorization performance. The algorithm performs automatic Chinese text categorization, improves the precision of web information retrieval, and greatly reduces the effort required for browsing and filtering. It can also be applied to the automatic categorization of E-government and E-commerce information.
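The feature selection and weighting steps can be illustrated with a minimal sketch. The abstract does not specify how document frequency and term frequency are combined or which weighting scheme is used, so everything here is an assumption made for illustration: terms are first filtered by a document-frequency threshold and then ranked by total term frequency, and a standard TF-IDF weight stands in for the unspecified weighting. The function names and the parameters `min_df` and `top_k` are hypothetical, and the input documents are assumed to be already tokenized (Chinese text would need a word segmenter first).

```python
import math
from collections import Counter


def count_frequencies(docs):
    """Return (term frequency, document frequency) counters over tokenized docs."""
    tf, df = Counter(), Counter()
    for tokens in docs:
        tf.update(tokens)        # every occurrence counts toward term frequency
        df.update(set(tokens))   # each document counts at most once per term
    return tf, df


def select_features(docs, min_df=2, top_k=3):
    """Assumed combination: keep terms with df >= min_df, rank them by total tf."""
    tf, df = count_frequencies(docs)
    kept = [term for term in tf if df[term] >= min_df]
    kept.sort(key=lambda term: (-tf[term], term))  # ties broken alphabetically
    return kept[:top_k]


def weight_vector(tokens, features, docs):
    """Weight one document over the selected features with TF-IDF (a stand-in)."""
    _, df = count_frequencies(docs)
    counts = Counter(tokens)
    n_docs = len(docs)
    # Selected features have df >= 1 as long as min_df >= 1, so no division by zero.
    return [counts[t] * (math.log(n_docs / df[t]) + 1.0) for t in features]


# Toy corpus of pre-tokenized documents (illustrative data only).
docs = [
    ["stock", "market", "rise"],
    ["stock", "market", "fall"],
    ["football", "match", "stock"],
]
features = select_features(docs)
weights = weight_vector(["stock", "stock", "market"], features, docs)
```

In this toy corpus only "stock" and "market" appear in at least two documents, so they are the only features kept. The resulting weight vectors would then be passed to whatever classifier is used for categorization.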