Probabilistic models for combining diverse knowledge sources in multimedia retrieval

In recent years, the multimedia retrieval community has been gradually shifting its emphasis from analyzing one media source at a time to exploring the opportunities of combining diverse knowledge sources from correlated media types and context. In order to combine multimedia knowledge sources, two basic issues must be addressed: what to combine and how to combine. While considerable effort has been expended to generate a wide range of ranking features from knowledge sources, relatively little attention has been given to the problem of finding a suitable strategy to combine them. It has always been a significant challenge to develop principled combination approaches and to capture useful factors such as query information and context information in the retrieval process. This thesis presents a conditional probabilistic retrieval model as a principled framework to combine diverse knowledge sources. This model can integrate multiple forms of ranking features (query-dependent and query-independent features) as well as query information and context information in a unified framework with a solid probabilistic foundation. Under this retrieval framework, we survey and develop a number of state-of-the-art approaches for extracting ranking features from multimedia knowledge sources. In order to deal with heterogeneous features, a discriminative learning approach is suggested for estimating the combination parameters. Moreover, an efficient rank learning approach has been developed to explicitly model the ranking relations in the learning process with much less training time. To incorporate query information in the combination model, this thesis develops a number of query analysis models that can automatically discover the mixing structure of the query space based on previous retrieval results, and predict combination parameters for unseen queries.
In more detail, we propose the query-class based analysis model, which requires the query classes to be defined manually, and a series of probabilistic latent query analysis (pLQA) models, which can automatically discover latent query classes from the development data by unifying the combination weight optimization and query class categorization into a discriminative learning framework. To adapt the combination function on a per-query basis, this thesis also presents a probabilistic local context analysis (pLCA) model to automatically leverage additional retrieval sources to improve initial retrieval outputs. A pLCA variant is proposed to utilize human feedback to adjust combination parameters. All the proposed approaches are evaluated on multimedia retrieval tasks with large-scale video collections. Beyond multimedia collections, we also evaluate our approaches on meta-search tasks with large-scale text collections. Experimental evaluations demonstrate the promising performance of the probabilistic retrieval framework with query analysis and context analysis in the task of knowledge source combination. The applicability of the proposed methods can be extended to many other areas, such as question answering, web IR, cross-lingual IR, multi-sensor fusion, human tracking, and so forth.
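
The discriminative combination idea summarized above can be illustrated with a small sketch. It is not the thesis's implementation. It assumes that each document (for example, a video shot) has two ranking features, a text score and a visual score, and that the probability of relevance is modeled as a logistic function of their weighted sum. The combination weights are fitted by plain gradient ascent on the log-likelihood. The function names, toy data, learning rate, and epoch count are all hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relevance_probability(feature_values, weights, bias):
    """P(relevant | document) as a logistic function of weighted ranking features."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, feature_values)))


def train_combination_weights(features, labels, lr=0.5, epochs=500):
    """Fit combination weights by batch gradient ascent on the log-likelihood.

    features: list of feature vectors, one per (query, document) pair.
    labels:   1 for relevant, 0 for non-relevant.
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = y - relevance_probability(x, weights, bias)
            for i, v in enumerate(x):
                grad_w[i] += err * v
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


def rank(documents, weights, bias):
    """Return document ids sorted by decreasing estimated relevance."""
    scored = [
        (relevance_probability(feats, weights, bias), doc_id)
        for doc_id, feats in documents
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy training data (assumed): [text_score, visual_score] with relevance labels.
train_features = [
    [0.9, 0.8], [0.8, 0.3], [0.7, 0.9],   # relevant
    [0.2, 0.7], [0.1, 0.1], [0.3, 0.2],   # non-relevant
]
train_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_combination_weights(train_features, train_labels)

# Two unseen shots that differ only in their text score.
candidates = [("shot_b", [0.1, 0.5]), ("shot_a", [0.9, 0.5])]
ranking = rank(candidates, weights, bias)
print(ranking)
```

A query-class based variant along the lines described above could keep one such weight vector per query class and select or mix them according to the class of the incoming query; that extension is omitted here to keep the sketch short.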
