Abnormal telephone identification via an ensemble-based classification framework

Abnormal telephones, which appear often in daily life, are a nuisance to nearly everyone. Most existing solutions cannot identify them in time and suffer from low efficiency and poor prediction accuracy. In cooperation with a telecom company, we collected a set of telecom user data. Our analysis of the data shows that identifying abnormal telephones is a special classification problem: the data are small in scale and contain noisy samples. In this paper, we propose an ensemble-based classification framework that first generates multiple training sets by partially resampling the original data, then builds a classifier for each generated training set, and finally combines the classifiers into an ensemble. We conduct experiments on real-life data to evaluate the performance of our framework against several practical classification algorithms. The empirical results and analysis demonstrate that our framework achieves a significant increase in accuracy on this difficult prediction task, especially when noisy samples are present.
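
The three steps named in the abstract (resample, train one classifier per training set, combine) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the resampling scheme, the base classifier, or the combination rule. The sketch assumes that "partial resampling" means drawing a fraction of the rows with replacement, uses a nearest-centroid classifier as a stand-in base learner, and combines by majority vote. All function names and parameters (`resample`, `fit_ensemble`, `fraction`, `n_models`) are hypothetical.

```python
import random
from collections import Counter


def resample(X, y, fraction, rng):
    # Assumed reading of "partial resampling": draw fraction * n rows
    # from the original data, with replacement.
    n = max(1, int(len(X) * fraction))
    idx = [rng.randrange(len(X)) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    # Stand-in base learner (an assumption): one mean vector per class.
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroids(model, row):
    # Predict the class whose centroid is closest in squared Euclidean distance.
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def fit_ensemble(X, y, n_models=15, fraction=0.7, seed=0):
    # One base classifier per resampled training set.
    rng = random.Random(seed)
    return [fit_centroids(*resample(X, y, fraction, rng)) for _ in range(n_models)]


def predict_ensemble(models, row):
    # Combine the base classifiers by simple majority vote (an assumption).
    votes = Counter(predict_centroids(m, row) for m in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: two clusters plus one deliberately mislabeled (noisy) sample.
    X = [[0, 0], [0, 1], [1, 0], [1, 1],
         [10, 10], [10, 11], [11, 10], [11, 11],
         [0.5, 0.5]]
    y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
    models = fit_ensemble(X, y)
    print(predict_ensemble(models, [0.2, 0.3]))
    print(predict_ensemble(models, [10.5, 10.5]))
```

Because each base classifier sees a different resampled subset, a single mislabeled sample influences only some of the voters, which is one intuition for why such an ensemble can tolerate noisy samples on small data.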
