AGELI: An integrated platform for the Assessment of national Genetic Evaluation results by Learning and Informing

We present an integrated platform for the preprocessing, analysis, alarm issuing and presentation of national genetic evaluation data, based on data mining. Our goal is an integrated qualitative description of national genetic evaluation results for three milk yield traits, which constitute a critical issue within the range of services provided by Interbull. Although the standard method for quality assurance appears sufficiently functional (Klei et al., 2002), recent years have seen progress on an alternative method for validating genetic evaluation results using data mining (Banos et al., 2003; Diplaris et al., 2004; Han and Kamber, 2000), potentially leading to inference on data quality. An alarm technique based on multiple criteria was recently established to assess and assure data quality (Diplaris et al., 2004). The idea was to use data-mining techniques, namely decision trees, and then apply a goodness-of-fit test to individual tree nodes and an F-test to corresponding nodes from consecutive evaluation runs, in order to detect possible abnormalities in bull proof distributions in various regions. In a previous report (Banos et al., 2003), predictions derived from the discovered associations were qualitatively compared with actual proofs, and discrepancies were confirmed using a data set with known errors.
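The comparison of corresponding nodes across consecutive evaluation runs can be illustrated with a minimal sketch. It is not the platform's implementation. It assumes that bull proofs have already been grouped by decision-tree node, that the F-test is a plain variance-ratio test, and that the caller supplies the critical value. The names `variance_ratio`, `flag_nodes`, `prev_run`, `curr_run` and the node labels are hypothetical, and the proof values are made up for illustration.

```python
from statistics import variance


def variance_ratio(previous, current):
    """Return the F statistic (larger sample variance over smaller) for two samples."""
    var_prev = variance(previous)
    var_curr = variance(current)
    larger, smaller = max(var_prev, var_curr), min(var_prev, var_curr)
    if smaller == 0:
        # Degenerate case: identical constant samples give 1.0, otherwise infinity.
        return float("inf") if larger > 0 else 1.0
    return larger / smaller


def flag_nodes(prev_run, curr_run, critical_value):
    """Return the sorted ids of shared nodes whose variance ratio exceeds critical_value."""
    alarms = []
    for node_id in sorted(prev_run.keys() & curr_run.keys()):
        if variance_ratio(prev_run[node_id], curr_run[node_id]) > critical_value:
            alarms.append(node_id)
    return alarms


# Hypothetical proofs per tree node for two consecutive evaluation runs.
prev_run = {
    "node_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "node_b": [10.0, 11.0, 12.0, 13.0, 14.0],
}
curr_run = {
    "node_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "node_b": [0.0, 6.0, 12.0, 18.0, 24.0],
}

# Illustrative threshold only; in practice it would come from the F
# distribution for each node's degrees of freedom and the chosen level.
print(flag_nodes(prev_run, curr_run, critical_value=6.39))  # ['node_b']
```

Here `node_a` is unchanged between runs (ratio 1.0), while the spread of proofs in `node_b` grows from a variance of 2.5 to 90, giving a ratio of 36 and raising an alarm. A goodness-of-fit test on individual nodes, as described above, would be a separate criterion and is not shown.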