Extraction of Error Detection Rules without Supervised Information from Log Files Using Automatically Defined Groups

Our main aim is to extract multiple rules from log files in the computer systems, to detect various levels of errors, and to inform these errors or configuration mistakes to the system administrators automatically, in order to manage them without expert knowledge. To satisfy this aim, we performed an extraction experiment from the log files of a system using Automatically Defined Groups (ADG), which is based on Genetic Programming. Moreover, we focused on "System State Pattern" related to the difference between normal daily state and abnormal state that some errors occur in the system. In this experiment, then, we tried to extract rules without any manually managed and supervised information, by using simple translation technique: regular expressions. As a result, 50 agents in the best individual were divided into 16 groups from 322 log files. This means that 16 rules were acquired. We confirmed these rules could detect some errors such as DNS configuration error. We could also find the importance of the rules because the rule with more agents tended to have a higher adopted frequency by evolutionary computation. Therefore, we consider that our method using ADG is useful for the diagnosis of computer systems, and helps administrators manage their systems without expert knowledge about their systems.