We present an integrated platform, based on data-mining, for the preprocessing, analysis, alarm issuing and presentation of national genetic evaluation data. Our goal is an integrated qualitative description of national genetic evaluation results for three milk yield traits, a critical issue in the range of services provided by Interbull. Although the standard method for quality assurance appears sufficiently functional (Klei et al., 2002), recent years have seen progress on an alternative method for validating genetic evaluation results using data-mining (Banos et al., 2003; Diplaris et al., 2004; Han and Kamber, 2000), potentially leading to inference on data quality. A new alarm technique based on multiple criteria was recently established to assess and assure data quality (Diplaris et al., 2004). The idea is to exploit data-mining techniques, specifically decision trees, and then apply a goodness-of-fit test to individual tree nodes and an F-test to corresponding nodes from consecutive evaluation runs, with the aim of discovering possible abnormalities in bull proof distributions in various regions. In a previous report (Banos et al., 2003), predictions derived from the discovered associations were qualitatively compared with actual proofs, and discrepancies were confirmed using a data set with known errors.
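The two node-level checks described above can be illustrated with a minimal sketch. It assumes that each decision-tree node is summarized by binned proof counts (for the goodness-of-fit check) and by the raw proofs that fall into the node in two consecutive evaluation runs (for the variance comparison). The function names, the input layout and the critical values are hypothetical and are not taken from the cited reports; a real system would take critical values from the chi-square and F distributions at the chosen significance level.

```python
from statistics import variance


def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic for the binned counts of one node."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def variance_ratio(previous_run, current_run):
    """F statistic for one node: larger sample variance over the smaller one."""
    v_prev = variance(previous_run)
    v_cur = variance(current_run)
    if min(v_prev, v_cur) == 0:
        raise ValueError("a zero variance makes the ratio undefined")
    return max(v_prev, v_cur) / min(v_prev, v_cur)


def node_alarms(observed, expected, previous_run, current_run,
                chi2_critical, f_critical):
    """Flag a node when either statistic exceeds its (assumed) critical value."""
    return {
        "goodness_of_fit": chi_square_statistic(observed, expected) > chi2_critical,
        "variance_shift": variance_ratio(previous_run, current_run) > f_critical,
    }


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from any evaluation run.
    alarms = node_alarms(
        observed=[10, 20, 30],
        expected=[20, 20, 20],
        previous_run=[1.0, 2.0, 3.0],
        current_run=[2.0, 4.0, 6.0],
        chi2_critical=5.991,  # chi-square, 2 degrees of freedom, 5% level
        f_critical=19.0,      # assumed threshold for this toy sample size
    )
    print(alarms)
```

Comparing the statistics against supplied critical values keeps the sketch free of external dependencies; computing p-values instead would require a statistics library.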
[1] F. Fikse et al. A method for verifying genetic evaluation results. 2002.
[2] Pericles A. Mitkas et al. An alarm firing system for national genetic evaluation quality control. 2004.
[3] Petra Perner et al. Data Mining - Concepts and Techniques. Künstliche Intell., 2002.
[4] Periklis Mitkas et al. Quality control of national genetic evaluation results using data-mining techniques; a progress report. 2003.
[5] J. Ross Quinlan et al. C4.5: Programs for Machine Learning. 1992.