Learning the semantics of words and pictures

We present a statistical model for organizing image collections which integrates semantic information provided by associated text and visual information provided by image features. The model is promising for information retrieval tasks such as database browsing and searching for images based on text, image features, or both. Furthermore, since the model learns relationships between text and image features, it can be used for novel applications such as associating words with pictures and unsupervised learning for object recognition.
