Detection of rare events within industrial datasets by means of data resampling and specific algorithms

This paper addresses the detection of rare patterns in unbalanced datasets from industrial settings. Such patterns usually correspond to infrequent but highly relevant events, such as product defects and machine faults. In this work, several approaches were tested for developing classifiers whose performance meets industrial requirements, i.e. a high recognition rate for infrequent patterns. Given the unbalanced nature of the available datasets, the most widely known techniques for dealing with this kind of data (i.e. resampling techniques and specific algorithms) were investigated and assessed; the most promising ones were then combined in order to exploit their respective advantages. This combination led to satisfactory results, which make the developed classifier usable in the industrial field.
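
As an illustration of the kind of resampling referred to above, the following minimal sketch shows random oversampling of the rare class together with a recognition-rate measure for rare patterns. It is not the method developed in the paper: the function names `random_oversample` and `minority_recall`, the label convention (1 for the rare event) and the toy data are assumptions made only for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class is as frequent as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def minority_recall(y_true, y_pred, rare_label=1):
    """Fraction of true rare events that were predicted as rare."""
    rare_preds = [p for t, p in zip(y_true, y_pred) if t == rare_label]
    if not rare_preds:
        return 0.0
    return sum(p == rare_label for p in rare_preds) / len(rare_preds)


# Toy data (assumed): 8 normal samples (label 0) and 2 rare events (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [5.0], [5.2]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

After resampling, both classes contain the same number of samples, so a classifier trained on `X_bal` and `y_bal` is not dominated by the frequent class. Oversampling should be applied to the training split only, so that the recognition rate is measured on data with the original class distribution.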