Extracting earnings information from financial statements via genetic algorithms

One of the goals of financial statement analysis is to extract firm-value-relevant information from financial statements. The process by which analysts extract this information can be considered a black box. In making forecasts and reports, analysts examine financial statement variables and derived quantities and aggregate this information with outside information. The process is subjective, and it is a stylized fact that some information is always left out, whether purposely or by necessity: it is humanly impossible to coherently aggregate information from too many variables. Therefore, a methodology in which information is extracted automatically is potentially attractive not just to analysts but also to lay investors. J. Ou and H. Penman (1989) proposed such an automated methodology. They developed the "Pr measure" to aggregate financial statement information and predict the signs of changes in annual company earnings (adjusted for drift). However, there are certain problems with Ou and Penman's method. We propose an alternative methodology for automatically extracting information from financial statements. In our method, the accounting information is aggregated by industry, the method of constructing the set of predictor variables mitigates a potential loss of information, and the criterion that decides whether variables are useful is predictive power rather than statistical significance. Our methodology requires a different, more extensive search than that performed by Ou and Penman. To accomplish the task, we use a genetic algorithm, a very efficient search and optimization tool.
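
To make the idea of searching for predictor variables by predictive power concrete, the following is a minimal sketch of a genetic algorithm over variable subsets. It is not the procedure used in the paper: the synthetic data, the simple class-mean scoring rule, and all names (`make_data`, `hit_rate`, `genetic_search`) and parameter values are assumptions made for illustration. Each candidate is a 0/1 mask over the variables, and its fitness is the fraction of earnings-change signs predicted correctly on a validation sample.

```python
import random

def make_data(n, n_vars, informative, rng):
    """Synthetic stand-in for financial ratios (hypothetical data).
    Label 1 means earnings increased, 0 means they decreased."""
    rows = []
    for _ in range(n):
        y = rng.choice([0, 1])
        x = [rng.gauss((1.0 if y else -1.0) if j in informative else 0.0, 1.0)
             for j in range(n_vars)]
        rows.append((x, y))
    return rows

def hit_rate(mask, train, valid):
    """Fraction of validation signs predicted correctly using only the
    variables switched on in mask. Weights and midpoints are fitted on train."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    weight, mid = {}, {}
    for j in selected:
        up = [x[j] for x, y in train if y == 1]
        down = [x[j] for x, y in train if y == 0]
        mean_up = sum(up) / max(len(up), 1)
        mean_down = sum(down) / max(len(down), 1)
        weight[j] = mean_up - mean_down       # direction of the variable
        mid[j] = (mean_up + mean_down) / 2.0  # decision midpoint
    hits = 0
    for x, y in valid:
        score = sum(weight[j] * (x[j] - mid[j]) for j in selected)
        hits += int((score > 0) == (y == 1))
    return hits / len(valid)

def genetic_search(train, valid, n_vars, rng,
                   pop_size=30, generations=25, p_mut=0.05):
    """Evolve 0/1 masks; return the best mask and the best fitness per generation."""
    def fitness(mask):
        return hit_rate(mask, train, valid)

    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(scored[0]))
        parents = scored[:pop_size // 2]      # truncation selection
        children = [scored[0][:]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

rng = random.Random(0)
train = make_data(200, 12, {0, 3, 7}, rng)
valid = make_data(100, 12, {0, 3, 7}, rng)
best_mask, history = genetic_search(train, valid, 12, rng)
print("selected variables:", [j for j, bit in enumerate(best_mask) if bit])
print("validation hit rate:", history[-1])
```

Because selection here is driven by the validation sample, an honest estimate of predictive power would need a further holdout sample not used in the search. Under the industry-level aggregation described above, one would presumably run such a search separately for each industry.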