Paper information - Data Mining Techniques for Explaining Social Events

Data Mining Techniques for Explaining Social Events

Machine learning can be a powerful tool for discovering patterns and classification models for social events. The most common use of data mining techniques is the categorization of new examples into specific classes. Nevertheless, a single number indicating classification accuracy, as obtained with SVM or similar non-transparent methods, is usually not sufficient when we want to understand the problem and check the obtained relations against human sense and knowledge. It is not sufficient because we do not know whether the underlying relations are logical to human experts in the research field, or even what the relations look like. We want to check the computer-constructed relations, whether they are already known or newly created. In many cases when dealing with social events, it is extremely important to combine computer and human knowledge. Classification trees or classification rules seem to be the best choice for this kind of problem. The problem that may arise in this case lies in the quality of the discovered patterns; for example, it is well known that some computer-generated relations seem important but statistically do not exceed what random choice would produce. That is why the procedure for constructing the best possible classification trees or rules from the data must follow certain rules. In short, first, the data has to be manipulated in various yet systematic ways, in consultation with a field expert. The manipulation can be carried out at the level of instances, attributes, the class, or the parameters of the data mining algorithm. Second, quality should be estimated in various ways, making it possible to choose the best tree of all. We performed a demographic analysis in the proposed way, obtaining some new relations and confirming some already published ones.
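The concern that a discovered relation may not exceed the chance of random choice can be illustrated with a small permutation test. The sketch below is not the authors' procedure. It assumes a one-split classification rule (a "stump") as a stand-in for a full classification tree, uses made-up synthetic data, and all names (`stump_accuracy`, `best_stump`, `permutation_p_value`) are hypothetical. The rule is re-fitted on shuffled class labels to estimate how often a rule of equal accuracy appears by chance alone.

```python
import random


def stump_accuracy(rows, labels, attr, threshold, positive_above):
    """Accuracy of a one-split rule on attribute `attr`.

    The rule predicts class 1 when row[attr] > threshold if
    `positive_above` is True, and the opposite otherwise.
    """
    correct = 0
    for row, label in zip(rows, labels):
        predicted = (row[attr] > threshold) == positive_above
        correct += int(predicted == bool(label))
    return correct / len(labels)


def best_stump(rows, labels):
    """Exhaustively search attributes, observed thresholds and both
    directions; return (accuracy, (attr, threshold, positive_above))."""
    best = (0.0, None)
    for attr in range(len(rows[0])):
        for threshold in sorted({row[attr] for row in rows}):
            for positive_above in (True, False):
                acc = stump_accuracy(rows, labels, attr, threshold, positive_above)
                if acc > best[0]:
                    best = (acc, (attr, threshold, positive_above))
    return best


def permutation_p_value(rows, labels, n_permutations=200, seed=0):
    """Estimate how often a rule at least as accurate as the observed one
    is found when the class labels are randomly shuffled."""
    rng = random.Random(seed)
    observed, _ = best_stump(rows, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Re-run the full rule search so the search itself is part of the test.
        acc, _ = best_stump(rows, shuffled)
        if acc >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Synthetic illustration only: attribute 0 determines the class,
# attribute 1 is pure noise.
data_rng = random.Random(1)
rows = [(i / 40, data_rng.random()) for i in range(40)]
labels = [1 if x0 >= 0.5 else 0 for x0, _ in rows]

accuracy, p_value = permutation_p_value(rows, labels)
rule = best_stump(rows, labels)[1]
print("rule:", rule, "accuracy:", accuracy, "p-value:", p_value)
```

A small p-value suggests that the rule's accuracy is unlikely to be a product of chance. The rule itself (attribute, threshold, direction) stays readable, so a field expert can still judge whether it is logical.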

Gams Matjaz | Krivec Jana