Unsupervised Ensemble Classification With Correlated Decision Agents

Decision-making procedures in which a set of individual binary labels is processed to produce a single joint decision can be approached by modeling the individual labels as multivariate independent Bernoulli random variables. This probabilistic model admits an unsupervised solution through EM-based algorithms, which estimate the parameters of the distribution and make the joint decision using a maximum a posteriori criterion. These methods usually assume that the individual decision agents are conditionally independent, an assumption that may not hold in practical setups. In this work we therefore formulate and solve the decision-making problem with an EM-based approach that assumes correlated decision agents. The proposed approach improves on classical and state-of-the-art algorithms on synthetic and real datasets.
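For orientation, the classical baseline that the abstract contrasts against (EM parameter estimation followed by a maximum a posteriori decision, with agents assumed conditionally independent given the true label) can be sketched as follows. This is a minimal illustration of that independent-agent baseline only, not the correlated-agent method proposed in this work. The function names, the per-agent sensitivity/specificity parameterization, the soft majority-vote initialization, and the synthetic data generator are all assumptions made for the example.

```python
import math
import random


def simulate_votes(n_items, accuracies, prior=0.5, seed=0):
    """Hypothetical helper: draw true binary labels and independent agent votes."""
    rng = random.Random(seed)
    truth = [1 if rng.random() < prior else 0 for _ in range(n_items)]
    votes = [
        [y if rng.random() < acc else 1 - y for acc in accuracies]
        for y in truth
    ]
    return truth, votes


def em_independent_agents(votes, n_iter=50, eps=1e-6):
    """EM for binary labels under conditionally independent agents.

    votes: list of N items, each a list of M binary votes (0 or 1).
    Returns (posteriors, decisions, prior, sensitivities, specificities).
    """
    n_items, n_agents = len(votes), len(votes[0])

    def clip(p):
        # Keep probabilities away from 0 and 1 so the logarithms stay finite.
        return min(max(p, eps), 1.0 - eps)

    # Initialise P(y = 1 | votes) with the fraction of positive votes.
    post = [sum(row) / n_agents for row in votes]

    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each agent's reliability.
        w1 = sum(post)
        w0 = n_items - w1
        prior = clip(w1 / n_items)
        sens, spec = [], []
        for j in range(n_agents):
            hits1 = sum(p * row[j] for p, row in zip(post, votes))
            hits0 = sum((1 - p) * (1 - row[j]) for p, row in zip(post, votes))
            sens.append(clip(hits1 / max(w1, eps)))  # P(vote = 1 | y = 1)
            spec.append(clip(hits0 / max(w0, eps)))  # P(vote = 0 | y = 0)

        # E-step: posterior of y = 1 for every item, computed in the log domain.
        post = []
        for row in votes:
            log1, log0 = math.log(prior), math.log(1.0 - prior)
            for j, x in enumerate(row):
                log1 += math.log(sens[j] if x == 1 else 1.0 - sens[j])
                log0 += math.log(1.0 - spec[j] if x == 1 else spec[j])
            diff = max(min(log0 - log1, 700.0), -700.0)  # guard math.exp
            post.append(1.0 / (1.0 + math.exp(diff)))

    # MAP decision: choose the label with the larger posterior.
    decisions = [1 if p > 0.5 else 0 for p in post]
    return post, decisions, prior, sens, spec


if __name__ == "__main__":
    truth, votes = simulate_votes(400, [0.9, 0.85, 0.8, 0.75, 0.7])
    _, decisions, _, _, _ = em_independent_agents(votes)
    accuracy = sum(d == t for d, t in zip(decisions, truth)) / len(truth)
    print(f"accuracy of the joint decision: {accuracy:.3f}")
```

Because the likelihood in this sketch factorizes over agents, correlated votes are counted as if they were independent evidence; that factorization is the assumption the abstract says may fail in practice.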
