Across-Model Collective Ensemble Classification

Ensemble classification methods that independently construct component models (e.g., bagging) improve accuracy over single models by reducing the error due to variance. Some work has extended ensemble techniques to classification in relational domains by taking relational data characteristics or multiple link types into account during model construction. Because these approaches follow the conventional approach to ensemble learning, however, they improve performance only by reducing the error due to variance in learning. We note that variance in inference can be an additional source of error in relational methods that use collective classification, since inferred values are propagated, and errors can compound, during inference. We propose a novel ensemble mechanism for collective classification that reduces both learning and inference variance by incorporating prediction averaging into the collective inference process itself. We show that our proposed method significantly outperforms a straightforward relational ensemble baseline on both synthetic and real-world datasets.

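To make the distinction concrete, the following minimal Python sketch (our own illustration, not the paper's implementation) shows the across-model idea: at every collective-inference iteration, the neighbor estimates that each component model propagates are first averaged across all models, rather than averaging only the models' final outputs. The toy graph, the sigmoid local models, and all parameter names are hypothetical stand-ins for component models learned on resampled networks.

    # A minimal sketch of across-model collective ensemble inference, assuming
    # simple sigmoid local models; the graph, models, and update rule are
    # illustrative stand-ins, not the authors' exact method.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy network: adjacency list over four nodes with binary class labels.
    adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
    n_nodes, n_models, n_iters = 4, 5, 10

    # Hypothetical component "models": each maps the mean neighbor estimate to
    # a class probability with slightly different parameters (stand-ins for
    # models learned on resampled subgraphs).
    weights = rng.normal(1.0, 0.3, size=n_models)
    biases = rng.normal(0.0, 0.1, size=n_models)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # One prediction vector per model: P[m, v] = P(y_v = 1) under model m.
    P = np.full((n_models, n_nodes), 0.5)
    known = {0: 1.0, 3: 0.0}  # observed labels clamped during inference

    for it in range(n_iters):
        # Across-model step: average current predictions over all models, so
        # every model propagates the ensemble's estimate, not its own.
        avg = P.mean(axis=0)
        for m in range(n_models):
            for v in range(n_nodes):
                if v in known:
                    P[m, v] = known[v]
                    continue
                neigh = np.mean([avg[u] for u in adj[v]])
                P[m, v] = sigmoid(weights[m] * (neigh - 0.5) + biases[m])

    final = P.mean(axis=0)  # ensemble prediction after collective inference
    print(np.round(final, 3))

Averaging inside the inference loop dampens the propagation of any single model's inference errors, which is the inference-variance component of error that the abstract targets; a conventional relational ensemble would instead run each model's collective inference independently and average only the final predictions.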