An Algorithm of Web Text Clustering Analysis Based on Fuzzy Set

Web text currently contains a large quantity of uncertain and unstructured content, which makes it difficult to cluster with conventional classification methods. This paper proposes an algorithm for Web text clustering analysis based on fuzzy sets and describes it in detail through an example. The technique can reduce the algorithm's time and space complexity and increase its robustness. To check the accuracy and efficiency of the algorithm, a comparative analysis of the sample and test data is provided at the end.
