Multiple Instance Learning with Generalized Support Vector Machines

In pattern classification it is usually assumed that a training set of labeled patterns is available. Multiple-Instance Learning (MIL) generalizes this problem setting by making weaker assumptions about the labeling information: while each pattern is still believed to possess a true label, training labels are associated with sets or bags of patterns rather than with individual patterns. More formally, we are given a set of patterns $x_1, \ldots, x_n$ grouped into bags $X_1, \ldots, X_m$, with $X_j = \{x_i : i \in I_j\}$ and $I_j \subseteq \{1, \ldots, n\}$. With each bag $X_j$ a label $Y_j \in \{-1, +1\}$ is associated. These labels are interpreted as follows: if a bag has a negative label $Y_j = -1$, all patterns in that bag inherit the negative label; if, on the other hand, $Y_j = +1$, then at least one pattern $x_i \in X_j$ is a positive example of the underlying concept.

The MIL scenario has many interesting applications. One prominent application is the classification of molecules in the context of drug design (Dietterich, Lathrop, & Lozano-Perez 1997), where each molecule is represented by a bag of its possible conformations. Another application is image retrieval, where images can be viewed as bags of local image patches (Maron & Ratan 1998) or image regions.

Algorithms for the MIL problem were first presented in (Dietterich, Lathrop, & Lozano-Perez 1997; Auer 1997; Long & Tan 1996). These methods (and the accompanying analytical results) are based on hypothesis classes consisting of axis-aligned rectangles. Similarly, methods developed subsequently (e.g., (Maron & Lozano-Perez 1998; Zhang & Goldman 2002)) have focused on specially tailored machine learning algorithms that do not compare favorably in the limiting case of bags of size 1 (the standard classification setting). A notable exception is (Ramon & Raedt 2000).
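The bag-labeling convention above can be summarized by the relation $Y_j = \max_{i \in I_j} y_i$, where $y_i \in \{-1, +1\}$ denotes the (unobserved) label of pattern $x_i$: a negative bag forces all of its instances to be negative, while a positive bag asserts only that at least one instance is positive. The following minimal Python sketch illustrates this relation; the `bag_label` helper and the toy bags are purely illustrative and are not part of any MIL algorithm discussed here.

```python
from typing import Sequence

def bag_label(instance_labels: Sequence[int]) -> int:
    """Derive a MIL bag label from (hidden) instance labels in {-1, +1}.

    Y_j = -1 means every instance in the bag is negative;
    Y_j = +1 asserts only that at least one instance is positive,
    i.e. Y_j = max_{i in I_j} y_i.
    """
    return max(instance_labels)

# Toy example: three bags of instance labels (unobserved in practice;
# only the resulting bag labels would be available for training).
bags = [
    [-1, -1, -1],  # all instances negative -> bag label -1
    [-1, +1, -1],  # one positive instance  -> bag label +1
    [+1, +1, -1],  # several positives      -> bag label +1
]
print([bag_label(b) for b in bags])  # [-1, 1, 1]
```

Note the asymmetry this relation induces: instance labels in negative bags are fully determined, whereas a positive bag constrains its instances only disjunctively, which is what makes the MIL setting harder than standard supervised classification.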