Method of Detecting a Fictitious Company on the Machine Learning Base

Fictitious firms, also conventionally called “anonymous commercial structures”, play a major role in economic crime. Detecting them is difficult because they break the chains of economic and financial relations between real enterprises, banks, insurance companies, exporters and importers, and trade enterprises. To address this problem, the authors developed a machine-learning method that detects the fraudulent activity of a fictitious enterprise (sham business). The parameters used to detect fictitious enterprises were derived from an analysis of the relevant publications. In a model of the economic activity of 100 Ukrainian enterprises, 20 fictitious enterprises were identified using eight classifiers: Support Vector Classifier, Stochastic Gradient Descent Classifier, Random Forest Classifier, Decision Tree Classifier, Gaussian Naive Bayes, K-Neighbors Classifier, AdaBoost Classifier, and Logistic Regression. The experiments showed that all of the selected classifiers give acceptable results. The best were Support Vector Classifier, Gaussian Naive Bayes, and Logistic Regression, with a forecast score of 0.98 and a standard deviation of 0.02.
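The comparison described in the abstract can be illustrated with a minimal sketch in scikit-learn. It is not the authors' implementation: the data are synthetic (generated with `make_classification` to mimic 100 enterprises, roughly 20% of them labelled fictitious), and the feature count, the 5-fold stratified cross-validation, the accuracy metric, and the helper name `evaluate` are all assumptions made for illustration. The abstract does not say how the reported score and standard deviation were computed; the sketch assumes they are the mean and standard deviation of cross-validated scores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): 100 enterprises,
# about 20% labelled fictitious, 8 hypothetical numeric indicators.
X, y = make_classification(
    n_samples=100, n_features=8, n_informative=5, n_redundant=1,
    weights=[0.8, 0.2], class_sep=2.0, random_state=0,
)

# The eight classifier families named in the abstract, with default-ish settings.
classifiers = {
    "Support Vector Classifier": SVC(),
    "Stochastic Gradient Descent": SGDClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
    "K-Neighbors": KNeighborsClassifier(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Stratified folds keep the fictitious/real ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def evaluate(models, X, y, cv):
    """Return {name: (mean accuracy, std of accuracy)} over the CV folds."""
    results = {}
    for name, model in models.items():
        # Scale features inside the pipeline so scaling is fit per training fold.
        pipeline = make_pipeline(StandardScaler(), model)
        scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.std())
    return results


results = evaluate(classifiers, X, y, cv)
for name, (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:30s} {mean:.2f} +/- {std:.2f}")
```

The numbers printed by this sketch depend on the synthetic data and will not reproduce the 0.98 score reported above. With a minority class of about 20%, a metric such as recall or F1 on the fictitious class may be more informative than plain accuracy; it can be selected through the `scoring` argument of `cross_val_score`.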
