Paper Information - Applying Machine Learning Diversity Metrics to Data Fusion in Information Retrieval

The supervised machine learning task of classification has parallels with Information Retrieval (IR): in each case, items (documents, in the case of IR) must be categorised into discrete classes (relevant or non-relevant). A parallel can therefore also be drawn between classifier ensembles, where evidence from multiple classifiers is combined to achieve a superior result, and the IR data fusion task. This paper presents preliminary experimental results on the applicability of classifier ensemble diversity metrics to data fusion. Initial results indicate a relationship between the quality of the fused result set (as measured by Mean Average Precision, MAP) and the diversity of its inputs.
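The abstract does not name the diversity metrics that were evaluated. As an illustrative sketch only, the Python below computes two pairwise measures from the classifier-ensemble literature, the disagreement measure and Yule's Q statistic, for two retrieval runs. The mapping from a run to a "classifier" is an assumption made here: a run is treated as labelling a document relevant if it retrieves it, and its output is "correct" when that label matches the relevance judgment. The names `oracle_outputs`, `pool`, `run_a`, and `run_b`, and the toy data, are hypothetical.

```python
def oracle_outputs(retrieved, relevant, pool):
    """For each pooled document, True if the run's implicit label
    (retrieved = relevant) agrees with the relevance judgment."""
    return [(doc in retrieved) == (doc in relevant) for doc in pool]


def contingency(correct_a, correct_b):
    """Count joint outcomes: both correct, only A, only B, neither."""
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif a:
            n10 += 1
        elif b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def disagreement(correct_a, correct_b):
    """Fraction of items on which exactly one system is correct."""
    n11, n10, n01, n00 = contingency(correct_a, correct_b)
    return (n10 + n01) / (n11 + n10 + n01 + n00)


def q_statistic(correct_a, correct_b):
    """Yule's Q: +1 for identical error patterns, negative when the
    systems tend to err on different items. Returns 0.0 if undefined."""
    n11, n10, n01, n00 = contingency(correct_a, correct_b)
    denominator = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denominator if denominator else 0.0


# Hypothetical toy data: six pooled documents, three judged relevant.
pool = ["d1", "d2", "d3", "d4", "d5", "d6"]
relevant = {"d1", "d2", "d3"}
run_a = {"d1", "d2", "d4"}
run_b = {"d1", "d3", "d5"}

correct_a = oracle_outputs(run_a, relevant, pool)
correct_b = oracle_outputs(run_b, relevant, pool)

dis = disagreement(correct_a, correct_b)
q = q_statistic(correct_a, correct_b)
print(f"disagreement={dis:.3f}, Q={q:.3f}")
```

In this toy case the two runs never err on the same document, so Q is at its minimum of -1 and the disagreement is 4/6. Measures of this kind could then be compared against the MAP of the fused result set, which is the relationship the abstract reports investigating.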
