Sampling Strategies for Active Learning in Personal Photo Retrieval

With the advent and proliferation of digital cameras and computers, the number of digital photos created and stored by consumers has grown extremely large. This has created increasing demand for image retrieval systems that ease interaction between consumers and their personal media content. Active learning is a widely used user interaction model for retrieval systems: it learns the query concept by asking users to label a number of images at each iteration. In this paper, we study sampling strategies for active learning in personal photo retrieval. To reduce human annotation effort in a content-based image retrieval setting, we propose using multiple sampling criteria for active learning: informativeness, diversity, and representativeness. Our experimental results show that combining multiple sampling criteria in active learning can significantly improve the performance of a personal photo retrieval system.
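
As an illustration of how the three criteria might be combined when choosing which images to present for labeling, the following is a minimal sketch and not the method evaluated in this paper. It assumes that each unlabeled image has a feature vector and a relevance probability from the current classifier. The function names, the cosine similarity measure, the linear weighting, and the greedy batch construction are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # guard against zero-length vectors
    return Xn @ Xn.T


def select_batch(X, probs, batch_size, w_info=1.0, w_rep=1.0, w_div=1.0):
    """Greedily pick indices of unlabeled images to show the user.

    X      : (n, d) feature vectors of the unlabeled pool (hypothetical input)
    probs  : (n,) classifier probability that each image is relevant
    w_*    : hypothetical weights for the three criteria
    """
    X = np.asarray(X, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n = len(X)
    sim = cosine_similarity_matrix(X)

    # Informativeness: 1 when the classifier is maximally unsure (p = 0.5),
    # 0 when it is fully confident (p = 0 or p = 1).
    info = 1.0 - 2.0 * np.abs(probs - 0.5)

    # Representativeness: mean similarity to the rest of the pool,
    # so images in dense regions of feature space score higher.
    rep = (sim.sum(axis=1) - np.diag(sim)) / max(n - 1, 1)

    selected = []
    for _ in range(min(batch_size, n)):
        # Diversity: penalize similarity to images already in the batch.
        if selected:
            redundancy = sim[:, selected].max(axis=1)
        else:
            redundancy = np.zeros(n)
        score = w_info * info + w_rep * rep - w_div * redundancy
        score[selected] = -np.inf  # never pick the same image twice
        selected.append(int(np.argmax(score)))
    return selected
```

In this sketch, setting w_div to zero reduces the selection to ranking by uncertainty and density alone, which tends to fill a batch with near-duplicate images; the diversity term is what spreads the batch across the pool.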