Automated Port-scan Classification with Decision Tree and Distributed Sensors

Computer worms randomly perform port scans to find vulnerable hosts to intrude over the Internet. Malicious software varies its port-scan strategy, e.g., some hosts intensively perform scans on a particular target and some hosts scan uniformly over IP address blocks. In this paper, we propose a new automated worm classification scheme from distributed observations. Our proposed scheme can detect some statistics of behavior with a simple decision tree consisting of some nodes to classify source addresses with optimal threshold values. The choice of thresholds is automated to minimize the entropy gain of the classification. Once a tree has been constructed, the classification can be done very quickly and accurately. In this paper, we analyze a set of source addresses observed by the distributed 30 sensors in ISDAS for a year in order to clarify a primary statistics of worms. Based on the statistical characteristics, we present the proposed classification and show the performance of the proposed scheme.