A procedure to detect suspected patterns of fraudulent behavior in vehicle emissions tests performed by an accredited inspection body

The National Institute of Metrology, Quality and Technology of Brazil (Inmetro), established by government decree 49/2010, has changed the requirements for vehicle safety inspections of both light and heavy vehicles converted to run on natural gas. Under the same decree (Inmetro 49/2010), the General Coordination for Accreditation of Inmetro is responsible for accrediting Brazilian vehicle safety inspection bodies. In recent years, news reports, complaints, and denunciations have pointed to fraud in these accredited inspections, which increases the risk of accidents and environmental damage. In this paper, we propose a procedure to detect suspected fraud by an accredited vehicle safety inspection body. The procedure combines clustering, digital analysis, and descriptive statistics, so that the clustering process considers both indicators of anomalous behavior and the attributes of the dataset objects. This mixed clustering structure groups objects with similar anomaly scores together. We then used descriptive statistics to identify which groups of observations were more likely to be fraudulent than others. In experiments, the proposed procedure successfully identified unusual patterns.
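As a rough illustration of how the three ingredients named above could fit together, the following Python sketch computes a digit-based anomaly score per inspector, clusters inspectors on that score plus one attribute, and summarizes each cluster. It is not the authors' implementation. It assumes that "digital analysis" means a first-digit (Benford's law) test, it substitutes plain k-means for the mixed clustering structure, and the inspector names, readings, and approval rates are all hypothetical.

```python
import math
from collections import Counter
from statistics import mean

# Expected first-digit frequencies under Benford's law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero digit of a positive number."""
    return int(f"{x:.10e}"[0])  # scientific notation, e.g. '3.14...e+02'

def benford_mad(values):
    """Mean absolute deviation of observed first-digit frequencies from Benford."""
    digits = [first_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    return mean(abs(counts.get(d, 0) / len(digits) - BENFORD[d]) for d in range(1, 10))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic seeding (a stand-in, see lead-in)."""
    step = max(1, len(points) // k)
    centers = [list(p) for p in sorted(points)[::step][:k]]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return assign(points, centers)

# Hypothetical data: readings per inspector and the share of approved tests.
log_uniform = [10 ** (i / 100) for i in range(300)]   # roughly Benford-conforming
inspectors = {
    "A": (log_uniform, 0.80),
    "B": (log_uniform, 0.82),
    "C": (log_uniform, 0.78),
    "D": (log_uniform, 0.81),
    "E": ([50 + i % 10 for i in range(300)], 0.99),   # leading digit always 5
    "F": ([70 + i % 10 for i in range(300)], 0.98),   # leading digit always 7
}

names = list(inspectors)
scores = {n: benford_mad(inspectors[n][0]) for n in names}
points = [(scores[n], inspectors[n][1]) for n in names]
labels = kmeans(points, k=2)

# Descriptive statistics: mean anomaly score per cluster.
cluster_mean = {
    c: mean(scores[n] for n, lab in zip(names, labels) if lab == c)
    for c in set(labels)
}
suspect_cluster = max(cluster_mean, key=cluster_mean.get)
flagged = {n for n, lab in zip(names, labels) if lab == suspect_cluster}
print(sorted(flagged))
```

In practice the two features would need rescaling before clustering, and a flagged cluster only marks inspections for follow-up review; it is not evidence of fraud by itself.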
