On Biases in Estimating Multi-Valued Attributes

We analyse the biases of eleven measures for estimating the quality of multi-valued attributes. The values of information gain, J-measure, gini-index, and relevance tend to increase linearly with the number of values of an attribute. The values of gain-ratio, distance measure, Relief, and the weight of evidence decrease for informative attributes and increase for irrelevant attributes. The bias of the statistical tests based on the chi-square distribution is similar, but these functions are not able to discriminate among attributes of different quality. We also introduce a new function based on the MDL principle whose value slightly decreases with the increasing number of attribute values.
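
The bias of information gain toward many-valued attributes can be illustrated with a small simulation. The following is a minimal sketch, not the experimental setup of this paper: it assumes a two-class problem with uniformly random labels and a uniformly random attribute that is irrelevant by construction. The function names, sample size, trial count, and seed are all illustrative choices.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attribute, labels):
    """Class entropy minus the expected class entropy after splitting on the attribute."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def mean_gain_of_random_attribute(num_values, num_examples=200, trials=50, seed=0):
    """Average information gain of an attribute drawn independently of the class.

    Assumption: two equiprobable classes and equiprobable attribute values,
    so the attribute carries no real information about the class.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        labels = [rng.randrange(2) for _ in range(num_examples)]
        attribute = [rng.randrange(num_values) for _ in range(num_examples)]
        total += information_gain(attribute, labels)
    return total / trials


if __name__ == "__main__":
    for v in (2, 5, 10, 20, 40):
        print(v, round(mean_gain_of_random_attribute(v), 4))
```

Although every attribute in this sketch is irrelevant, the estimated gain grows as the number of values grows, which is the kind of bias the abstract describes for information gain.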