Evolving Multi-Variate Time-Series Patterns for the Discrimination of Fraudulent Financial Filings

ABSTRACT This paper considers an application of evolutionary computation (EC) to classification and pattern discovery. In particular, we present a genetic algorithm (GA) used to discriminate cases of potential financial statement fraud. Of key interest is the ability to distinguish multidimensional patterns over time. The GA evolves strings over a pattern definition language to define class boundaries and to select classification features. The language allows for (1) the integration of data across time and across a number of variables, (2) the integration of quantitative as well as qualitative data, (3) direct evolution by the genetic algorithm, and (4) easy interpretation by human experts. The data and method are described and results are presented. The method achieves a 63% true positive rate at a false positive rate of 5%. These results compare favorably with other published results on comparable data. Our technique captures behaviors that are not evident from traditional data analysis methods. The output of our system has the additional benefit of being easily understood and used by experts and practitioners in the field, which makes our approach more desirable than black-box solutions. These techniques provide a foundation for multidimensional behavior analysis of data from a variety of domains, including financial, biological, manufacturing, and clinical data.
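To make the idea of scoring a candidate pattern concrete, the following is a minimal sketch and not the authors' pattern definition language, which the abstract does not specify. It assumes a hypothetical encoding in which a pattern is a sequence of (variable, direction) conditions over consecutive reporting periods, and it computes the true positive and false positive rates that a GA fitness function could be built from. The variable names, the "U"/"D" symbols, and the two toy cases are illustrative assumptions only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not the paper's language): one
# (variable, direction) condition per consecutive period-to-period change.
# "U" means the value rose; "D" means it fell or stayed flat.
Pattern = List[Tuple[str, str]]
Series = Dict[str, List[float]]


def matches(pattern: Pattern, series: Series) -> bool:
    """Return True if the pattern holds over some run of consecutive periods."""
    n_periods = min(len(values) for values in series.values())
    n_steps = n_periods - 1  # number of period-to-period changes available
    for start in range(n_steps - len(pattern) + 1):
        ok = True
        for offset, (var, direction) in enumerate(pattern):
            t = start + offset
            rose = series[var][t + 1] > series[var][t]
            if rose != (direction == "U"):
                ok = False
                break
        if ok:
            return True
    return False


def rates(pattern: Pattern, cases: List[Tuple[Series, bool]]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) of the pattern as a classifier."""
    tp = fp = pos = neg = 0
    for series, is_fraud in cases:
        hit = matches(pattern, series)
        if is_fraud:
            pos += 1
            tp += hit
        else:
            neg += 1
            fp += hit
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr, fpr


# Toy, made-up cases: (multivariate series, labeled-as-fraud flag).
cases = [
    ({"revenue": [1.0, 1.2, 1.5], "cash_flow": [0.5, 0.4, 0.3]}, True),
    ({"revenue": [1.0, 1.1, 1.0], "cash_flow": [0.5, 0.6, 0.7]}, False),
]

# Candidate pattern: revenue rises, then cash flow falls in the next period.
pattern = [("revenue", "U"), ("cash_flow", "D")]

tpr, fpr = rates(pattern, cases)
```

A GA along the lines the abstract describes could mutate and recombine such pattern strings and rank them by a fitness that rewards a high true positive rate and penalizes false positives. The exact fitness function is not given in the abstract, so any particular weighting would be a further assumption.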
