Weakly supervised classification with bagging in fisheries acoustics

Statistical learning makes it possible to build a probabilistic classification model. In the supervised case, the model is estimated from a labelled dataset, i.e. each observation has a label. In the weakly supervised case, the label is not known exactly. In our setting, the probability that an observation belongs to each class is known, so the label of each observation is a probability vector. The methods developed in this paper are applied to object recognition in images. These images contain objects that must be assigned to classes. The ground truth is the relative proportion of classes in each labelled image. This global proportion yields a probability-vector label for each training object. The originality of this paper lies in combining weakly labelled data with several probabilistic discriminative models that are aggregated using bagging. Two classification models (Bayesian and discriminative) are compared on oceanographic data. The objective is to recognize the species of fish schools in acoustic images. The relative class proportions in labelled images are given by successive trawl catches. The results show that the discriminative model is more robust than the Bayesian model. The contribution of bagging is shown for the discriminative model.
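The idea of proportion-based labels combined with bagging can be illustrated with a minimal Python sketch. It is not the authors' implementation: the softmax-regression base learner, the function names (`proportion_labels`, `fit_softmax`, `bagged_predict`), the toy data, and all hyperparameters are assumptions chosen for illustration. Each object inherits the class-proportion vector of its image as a soft label, several discriminative models are trained on bootstrap resamples, and their predicted probabilities are averaged.

```python
import numpy as np

def proportion_labels(image_ids, proportions):
    """Give every object the class-proportion vector of the image it comes from."""
    return np.stack([proportions[i] for i in image_ids])

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def fit_softmax(X, P, n_steps=500, lr=0.1):
    """Softmax regression (an assumed base learner) fitted by gradient
    descent on the cross-entropy against the soft labels P."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    W = np.zeros((Xb.shape[1], P.shape[1]))
    for _ in range(n_steps):
        Q = softmax(Xb @ W)
        W -= lr * Xb.T @ (Q - P) / len(X)
    return W

def predict_proba(W, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return softmax(Xb @ W)

def bagged_predict(X, P, X_new, n_models=10, seed=0):
    """Bagging: train one model per bootstrap resample, average the probabilities."""
    rng = np.random.default_rng(seed)
    probs = np.zeros((len(X_new), P.shape[1]))
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        probs += predict_proba(fit_softmax(X[idx], P[idx]), X_new)
    return probs / n_models

# Toy data (hypothetical): two classes, two "images" with known class proportions.
rng = np.random.default_rng(1)
class0 = rng.normal([-2.0, 0.0], 0.5, size=(50, 2))
class1 = rng.normal([2.0, 0.0], 0.5, size=(50, 2))
# Image "a" holds 40 class-0 and 10 class-1 objects; image "b" the reverse.
X = np.vstack([class0[:40], class1[:10], class0[40:], class1[10:]])
image_ids = ["a"] * 50 + ["b"] * 50
proportions = {"a": np.array([0.8, 0.2]), "b": np.array([0.2, 0.8])}

P = proportion_labels(image_ids, proportions)
X_new = np.array([[-2.0, 0.0], [2.0, 0.0]])
probs = bagged_predict(X, P, X_new)
predicted = probs.argmax(axis=1)
```

In this sketch no object ever receives an exact label; the classes become separable only because objects with similar features share, on average, similar proportion vectors across images.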
