IRTED-TL: An Inter-Region Tax Evasion Detection Method Based on Transfer Learning

Tax evasion detection plays a crucial role in reducing tax revenue loss. Many efforts have applied machine learning techniques to tax evasion detection, but none has produced a uniform model for different geographical regions, because an effective detection model requires an ample supply of training examples. When sufficient tax data are not readily available for a region, building a representative detection model is harder still, because feature distributions differ across regions. Existing methods also struggle to explain and trace the results they produce. To overcome these challenges, we propose an Inter-Region Tax Evasion Detection method based on Transfer Learning (IRTED-TL), which is designed to simultaneously augment the training data and build interpretability into the detection model. We exploit evasion-related knowledge from one region and use transfer learning techniques to strengthen tax evasion detection in other regions where training examples are lacking. We provide a unified framework that draws on auxiliary data through a transfer learning mechanism and builds an interpretable classifier for inter-region tax evasion detection. Experimental tests on real-world tax data demonstrate that IRTED-TL detects tax evaders with higher accuracy and better interpretability than existing methods.
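
The abstract does not say which transfer mechanism or classifier IRTED-TL uses, so the following is only a rough illustration of the general idea, not the authors' method. It is a minimal sketch of TrAdaBoost-style instance reweighting with a one-feature decision stump as the interpretable classifier: labeled examples from a data-rich source region are pooled with a few labeled examples from the target region, and source examples that disagree with the target region's pattern are progressively down-weighted. The function names (`fit_stump`, `transfer_reweight`), the single "deduction ratio" feature, and all data values are hypothetical. For brevity the sketch returns the stump from the last round instead of the weighted vote over later rounds that full TrAdaBoost uses.

```python
import math

def fit_stump(xs, ys, weights):
    """Return the threshold t minimising the weighted error of 'x > t => evader'."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t, best_err = None, float("inf")
    for t in candidates:
        err = sum(w for x, y, w in zip(xs, ys, weights) if int(x > t) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def transfer_reweight(src_x, src_y, tgt_x, tgt_y, rounds=5):
    """TrAdaBoost-style weight update (simplified: last stump is returned)."""
    n = len(src_x)                      # number of source-region examples
    xs, ys = src_x + tgt_x, src_y + tgt_y
    weights = [1.0] * len(xs)
    # Fixed shrink factor applied to misclassified source examples.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / rounds))
    threshold = None
    for _ in range(rounds):
        threshold = fit_stump(xs, ys, weights)
        wrong = [int(int(x > threshold) != y) for x, y in zip(xs, ys)]
        # Weighted error measured on the target region only.
        tgt_total = sum(weights[n:])
        eps = sum(w * e for w, e in zip(weights[n:], wrong[n:])) / tgt_total
        eps = min(max(eps, 1e-10), 0.499)   # keep beta_tgt finite and below 1
        beta_tgt = eps / (1.0 - eps)
        for i, e in enumerate(wrong):
            if i < n:
                weights[i] *= beta_src ** e       # shrink misfit source examples
            else:
                weights[i] *= beta_tgt ** (-e)    # boost hard target examples
    return threshold, weights[:n]

# Hypothetical toy data: one feature (a "deduction ratio"), label 1 = evader.
# The last two source examples contradict the target region's pattern.
SRC_X = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.6, 0.4]
SRC_Y = [0, 0, 0, 1, 1, 1, 0, 1]
TGT_X = [0.2, 0.45, 0.55, 0.9]
TGT_Y = [0, 0, 1, 1]

threshold, source_weights = transfer_reweight(SRC_X, SRC_Y, TGT_X, TGT_Y)
print("learned rule: flag as evader if deduction ratio >", threshold)
print("source weights:", [round(w, 3) for w in source_weights])
```

A single threshold rule is about the simplest classifier whose decision can be read off and traced directly, which is why it stands in for the interpretable component here; a real system would use many features and a richer model.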
