Probabilistic combination of text classifiers using reliability indicators: models and results

The intuition that different text classifiers behave in qualitatively different ways has long motivated attempts to build a better metaclassifier via some combination of classifiers. We introduce a probabilistic method for combining classifiers that considers the context-sensitive reliabilities of contributing classifiers. The method harnesses reliability indicators: variables that provide a valuable signal about the performance of classifiers in different situations. We provide background, present procedures for building metaclassifiers that take into consideration both reliability indicators and classifier outputs, and review a set of comparative studies undertaken to evaluate the methodology.
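To make the idea of a metaclassifier over classifier outputs and reliability indicators concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical base classifiers with scores `s1` and `s2`, a single hypothetical binary reliability indicator `r` (`r = 0` means the first classifier is trustworthy, `r = 1` the second), and a plain logistic regression with interaction terms as the metaclassifier. The toy data, the feature layout, and all function names are illustrative assumptions.

```python
import math
from itertools import product


def meta_features(s1, s2, r):
    # Bias, raw classifier scores, the reliability indicator, and
    # score-by-indicator interactions so the weight given to each
    # classifier can depend on the context signalled by r.
    return [1.0, s1, s2, r, s1 * r, s2 * r]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_logistic(X, y, lr=0.5, steps=2000):
    # Batch gradient descent on the mean logistic loss, starting from zero.
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def accuracy(predictions, labels):
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


# Toy data (an assumption for illustration): the reliable classifier emits
# the true signal, the unreliable one emits label-independent noise.
rows = []
for label, r, noise in product((0, 1), (0, 1), (-1.0, 1.0)):
    signal = 1.0 if label == 1 else -1.0
    s1, s2 = (signal, noise) if r == 0 else (noise, signal)
    rows.append((s1, s2, float(r), label))

labels = [row[3] for row in rows]
X = [meta_features(s1, s2, r) for s1, s2, r, _ in rows]
w = train_logistic(X, labels)

base1_acc = accuracy([1 if s1 > 0 else 0 for s1, _, _, _ in rows], labels)
base2_acc = accuracy([1 if s2 > 0 else 0 for _, s2, _, _ in rows], labels)
meta_acc = accuracy([1 if score(w, x) > 0 else 0 for x in X], labels)

print(f"classifier 1: {base1_acc:.2f}")
print(f"classifier 2: {base2_acc:.2f}")
print(f"metaclassifier: {meta_acc:.2f}")
```

In this toy setting each base classifier is only right when its context is favorable, while the metaclassifier can use `r` to decide which score to trust. Any learner able to model interactions between indicators and outputs could stand in for the logistic regression used here.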
