Active learning in very large databases

Query-by-example and query-by-keyword both suffer from the problem of “aliasing,” meaning that example-images and keywords potentially have variable interpretations or multiple semantics. For discerning which semantic is appropriate for a given query, we have established that combining active learning with kernel methods is a very effective approach. In this work, we first examine active-learning strategies, and then focus on addressing the challenges of two scalability issues: scalability in concept complexity and in dataset size. We present remedies, explain limitations, and discuss future directions that research might take.

[1]  Edward Y. Chang,et al.  Clustering for Approximate Similarity Search in High-Dimensional Spaces , 2002, IEEE Trans. Knowl. Data Eng..

[2]  Daphne Koller,et al.  Support Vector Machine Active Learning with Application sto Text Classification , 2000, ICML.

[3]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[4]  Gautam Biswas,et al.  Unsupervised Learning with Mixed Numeric and Nominal Data , 2002, IEEE Trans. Knowl. Data Eng..

[5]  Edward Y. Chang,et al.  MEGA---the maximizing expected generalization algorithm for learning complex query concepts , 2003, TOIS.

[6]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[7]  Klaus Brinker,et al.  Incorporating Diversity in Active Learning with Support Vector Machines , 2003, ICML.

[8]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[9]  Edward Y. Chang,et al.  Statistical learning for effective visual information retrieval , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[10]  Edward Y. Chang,et al.  Multimodal concept-dependent active learning for image retrieval , 2004, MULTIMEDIA '04.

[11]  Edward Y. Chang,et al.  Exploiting Geometry for Support Vector Machine Indexing , 2005, SDM.

[12]  Edward Y. Chang,et al.  CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines , 2003, IEEE Trans. Circuits Syst. Video Technol..

[13]  Dragutin Petkovic,et al.  Query by Image and Video Content: The QBIC System , 1995, Computer.

[14]  Edward Y. Chang,et al.  Support vector machine active learning for image retrieval , 2001, MULTIMEDIA '01.

[15]  Edward Y. Chang,et al.  Discovery of a perceptual distance function for measuring image similarity , 2003, Multimedia Systems.