Clustering algorithm for categorical data considering sorting by weight

To address the problem that some clustering algorithms are sensitive to the data input order, a non-interference sequence index was defined, and an approach that applies the non-interference sequence to sort categorical data by weight was proposed. Based on this approach, a new clustering algorithm considering sorting by weight (CABOSFV CSW) was presented to improve CABOSFV C, which is an efficient clustering algorithm for categorical data but is sensitive to the data input order. The approach eliminates this sensitivity to the data input order. UCI benchmark data sets were used to compare the proposed CABOSFV CSW algorithm with the traditional CABOSFV C algorithm and with other algorithms that are sensitive to the data input order. Empirical tests show that the new CABOSFV CSW algorithm effectively improves both clustering accuracy and stability on categorical data.