A content-independent domain abuse detection method

This paper proposes a set of language-independent features for detecting domain name abuse, covering four groups: domain name string features, registration features, resolution features and service features. Six pattern recognition algorithms are trained in the corresponding feature spaces. To validate the effectiveness of the extracted features and the learning algorithms, a practical data set is constructed, and the performance of the features and algorithms is compared and analysed. The experimental results show that the multi-scale features extracted in this paper discriminate well between abusive and non-abusive domains. The Random Forest algorithm achieves the best overall performance using only 8-dimensional fusion features, with an F1-Measure of 0.965 and a ROC Area of 0.978.
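To make the notion of a content-independent string feature concrete, the following is a minimal sketch of how such features might be computed from the domain name alone, without fetching any page content. The function name `string_features` and the particular features chosen (length, label count, digit ratio, hyphen count, character entropy) are illustrative assumptions; they are not the feature set used in this paper.

```python
import math
from collections import Counter


def string_features(domain: str) -> dict:
    """Compute a few illustrative string features of a domain name.

    Hypothetical helper: the features below are assumptions chosen for
    illustration, not the features defined in the paper.
    """
    # Normalise case and drop a trailing root dot, if present.
    name = domain.lower().rstrip(".")
    labels = name.split(".")

    # All characters except the dot separators.
    chars = name.replace(".", "")
    total = len(chars)
    counts = Counter(chars)

    # Shannon entropy (bits per character) of the character distribution.
    entropy = 0.0
    if total:
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "length": len(name),
        "num_labels": len(labels),
        "digit_ratio": sum(ch.isdigit() for ch in chars) / total if total else 0.0,
        "hyphen_count": name.count("-"),
        "entropy": entropy,
    }
```

A call such as `string_features("a1-b2.example.org")` returns a dictionary that could serve as one part of a fused feature vector, alongside registration, resolution and service features obtained from other sources.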