Paper Citations

Sheng Tang, Jintao Li, Gang Cao et al.,
2012

With the rapid growth of multimedia application technologies and network technologies, especially the proliferation of Web 2.0 and digital cameras, there has been an explosion of images and videos in ...

Ling-Yu Duan, Hanqing Lu, Qingshan Liu et al.,
2008,
IEEE Transactions on Multimedia

With the advance of digital video recording and playback systems, the request for efficiently managing recorded TV video programs is evident so that users can readily locate and browse their favorite ...

While accuracy and speed get a lot of attention in video retrieval research, the investigation of interactive retrieval tools gets less attention and is often regarded as trivial. We want to show that...

Rong Yan, Apostol Natsev, Lexing Xie et al.,
2007,
ACM Multimedia

We study the problem of semantic concept-based query expansion and re-ranking for multimedia retrieval. In particular, we explore the utility of a fixed lexicon of visual semantic concepts for automat...
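As a rough illustration of the re-ranking idea described above, the sketch below fuses an initial retrieval score with detector scores for query-related concepts. The function names, the linear fusion, and the weight alpha are illustrative assumptions, not the authors' formulation.

```python
# Hypothetical sketch: re-rank an initial retrieval list by mixing in
# detector scores for query-related semantic concepts.
import numpy as np

def rerank(initial_scores, concept_scores, concept_weights, alpha=0.7):
    """initial_scores: (n_shots,) scores from the initial search.
    concept_scores: (n_shots, n_concepts) detector outputs per shot.
    concept_weights: (n_concepts,) relevance of each concept to the query.
    alpha: interpolation weight between initial and concept-based scores."""
    concept_part = concept_scores @ concept_weights

    def norm(x):
        # Normalize a score vector to [0, 1] before fusing.
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    fused = alpha * norm(initial_scores) + (1 - alpha) * norm(concept_part)
    return np.argsort(-fused)  # shot indices, best first

# Toy usage: 5 shots, 3 concepts.
ranking = rerank(np.array([0.9, 0.1, 0.4, 0.7, 0.2]),
                 np.random.rand(5, 3),
                 np.array([0.6, 0.3, 0.1]))
print(ranking)
```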

Nicu Sebe, Bogdan Ionescu, Jasper R. R. Uijlings et al.,
2016,
Comput. Vis. Image Underst.

We proposed a novel framework for Relevance Feedback based on the Fisher Kernel. The Fisher Kernel representation makes it possible to capture temporal variation by using frame-based features. We experimen...
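A minimal sketch of a Fisher-vector style encoding of frame-based features (gradients with respect to GMM means only), which a relevance-feedback classifier could consume. The simplifications and the scikit-learn GMM are assumptions, not the paper's exact Fisher Kernel formulation.

```python
# Hypothetical sketch: simplified Fisher-vector encoding of frame-level features.
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(frames, gmm):
    """frames: (n_frames, d) descriptors from one video shot."""
    post = gmm.predict_proba(frames)              # (n_frames, k) responsibilities
    diff = frames[:, None, :] - gmm.means_[None]  # (n_frames, k, d)
    diff /= np.sqrt(gmm.covariances_)[None]       # assumes diagonal covariances
    grad_mu = (post[..., None] * diff).sum(0)     # (k, d) gradient w.r.t. means
    grad_mu /= frames.shape[0] * np.sqrt(gmm.weights_)[:, None]
    fv = grad_mu.ravel()
    return fv / (np.linalg.norm(fv) + 1e-12)      # L2 normalization

rng = np.random.default_rng(0)
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(rng.normal(size=(500, 16)))               # background model on pooled frames
print(fisher_vector(rng.normal(size=(40, 16)), gmm).shape)  # (4 * 16,)
```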

Koichi Shinoda, Nakamasa Inoue,
2015,
ACM Multimedia

We propose vocabulary expansion for video semantic indexing. From many semantic concept detectors obtained by using training data, we make detectors for concepts not included in training data. First, ...
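One simple reading of vocabulary expansion is to score an unseen concept as a similarity-weighted combination of existing detectors. The sketch below assumes a hand-made similarity function and top-k weighting purely for illustration; it is not the paper's method.

```python
# Hypothetical sketch: pseudo detector for an unseen concept built from
# existing detector scores weighted by concept similarity.
import numpy as np

def expand_vocabulary(target, known_scores, similarity, top_k=3):
    """target: unseen concept name.
    known_scores: dict {concept: (n_shots,) detector scores}.
    similarity: callable (concept_a, concept_b) -> float in [0, 1]."""
    sims = sorted(((similarity(target, c), c) for c in known_scores), reverse=True)
    sims = [(s, c) for s, c in sims[:top_k] if s > 0]
    total = sum(s for s, _ in sims)
    out = np.zeros(len(next(iter(known_scores.values()))))
    for s, c in sims:
        out += (s / total) * known_scores[c]   # similarity-weighted combination
    return out

# Toy usage with a hand-made similarity table.
table = {("puppy", "dog"): 0.9, ("puppy", "cat"): 0.4, ("puppy", "car"): 0.0}
sim = lambda a, b: table.get((a, b), 0.0)
scores = {"dog": np.array([0.8, 0.1]), "cat": np.array([0.5, 0.3]),
          "car": np.array([0.0, 0.9])}
print(expand_vocabulary("puppy", scores, sim))
```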

Shih-Fu Chang, Lyndon S. Kennedy et al.,
2007,
CIVR '07

We propose to incorporate hundreds of pre-trained concept detectors to provide contextual information for improving the performance of multimodal video search. The approach takes initial search result...

Koichi Shinoda, Nakamasa Inoue,
2014,
ACM Multimedia

We propose n-gram modeling of shot sequences for video semantic indexing, in which semantic concepts are extracted from a video shot. Most previous studies for this task have assumed that video shots ...
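To make the shot-sequence idea concrete, the sketch below collects bigram statistics over per-shot concept labels and scores a new sequence with add-one smoothing. This is a generic n-gram illustration, not the paper's specific model.

```python
# Hypothetical sketch: bigram statistics over per-shot concept labels.
import math
from collections import Counter

def train_bigrams(videos):
    """videos: list of concept-label sequences, one label per shot."""
    uni, bi = Counter(), Counter()
    for labels in videos:
        uni.update(labels)
        bi.update(zip(labels, labels[1:]))
    return uni, bi

def bigram_logprob(labels, uni, bi, vocab_size, smooth=1.0):
    # Add-one-smoothed log-probability of a shot label sequence.
    lp = 0.0
    for prev, cur in zip(labels, labels[1:]):
        lp += math.log((bi[(prev, cur)] + smooth) / (uni[prev] + smooth * vocab_size))
    return lp

uni, bi = train_bigrams([["sky", "person", "car"], ["sky", "car", "car"]])
print(bigram_logprob(["sky", "car"], uni, bi, vocab_size=3))
```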

Emmanuel Dellandréa, Liming Chen, Charles-Edmond Bichot et al.,
2012,
ECCV Workshops

We propose in this paper a novel multimodal approach to automatically predict the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighte...
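A minimal sketch of weighted late fusion, where a single fusion weight between visual and textual scores is chosen on validation data by average precision. The grid search and toy data are assumptions standing in for the selective weighting described above.

```python
# Hypothetical sketch: per-concept weighted late fusion of visual and textual scores.
import numpy as np
from sklearn.metrics import average_precision_score

def select_weight(val_visual, val_text, val_labels, grid=np.linspace(0, 1, 11)):
    """Pick the fusion weight that maximizes AP on validation data."""
    aps = [average_precision_score(val_labels, w * val_visual + (1 - w) * val_text)
           for w in grid]
    return grid[int(np.argmax(aps))]

def fuse(visual, text, w):
    return w * visual + (1 - w) * text

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, 200)
visual = labels + 0.8 * rng.normal(size=200)   # toy scores, visual is more reliable
text = labels + 1.5 * rng.normal(size=200)
w = select_weight(visual, text, labels)
print("chosen weight:", w)
```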

We propose a query-by-example method that can retrieve a variety of shots relevant to a query, even when these shots contain significantly different features due to varied shooting techniques and settings. ...

Jenny Benois-Pineau, Georges Quénot, Tomas Piatrik et al.,
2014,
Advances in Computer Vision and Pattern Recognition

We propose a novel multimodal approach to automatically predict the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighted Late Fusion ...

Sheng Tang, Yongdong Zhang, Jintao Li et al.,
2006,
PCM

We propose a novel method for extracting text feature from the automatic speech recognition (ASR) results in semantic video retrieval. We combine HowNet-rule-based knowledge with statistic information...

Julie Delon, Jean-Michel Morel, Mariano Rodríguez et al.,
2018,
SIAM J. Imaging Sci.

We propose a mathematical method to analyze the numerous algorithms performing Image Matching by Affine Simulation (IMAS). To become affine invariant they apply a discrete set of affine transforms to ...
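In the spirit of the affine-simulation methods the paper analyzes (ASIFT-style matching), the sketch below distorts both images with a small discrete set of rotations and tilts before SIFT matching. The tilt/angle grid and OpenCV usage are illustrative assumptions; the paper itself is a mathematical analysis of such algorithms, not this code.

```python
# Hypothetical sketch: a tiny affine-simulation loop before SIFT matching.
# Assumes OpenCV >= 4.4 with SIFT available.
import cv2
import numpy as np

def simulate_tilts(img, tilts=(1.0, 1.4, 2.0), angles=(0, 45, 90, 135)):
    """Yield affinely distorted versions of img."""
    h, w = img.shape[:2]
    for t in tilts:
        for a in (angles if t > 1.0 else (0,)):
            rot = cv2.getRotationMatrix2D((w / 2, h / 2), a, 1.0)
            rotated = cv2.warpAffine(img, rot, (w, h))
            yield cv2.resize(rotated, (w, max(1, int(h / t))))  # simulate tilt

def match_affine_simulated(img1, img2):
    sift = cv2.SIFT_create()
    bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    best = 0
    for v1 in simulate_tilts(img1):
        k1, d1 = sift.detectAndCompute(v1, None)
        if d1 is None:
            continue
        for v2 in simulate_tilts(img2):
            k2, d2 = sift.detectAndCompute(v2, None)
            if d2 is None:
                continue
            best = max(best, len(bf.match(d1, d2)))
    return best  # best raw match count over all simulated view pairs
```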

Koichi Shinoda, Nakamasa Inoue et al.,
2012,
IEEE Transactions on Multimedia

We propose a fast maximum a posteriori (MAP) adaptation method for video semantic indexing that uses Gaussian mixture model (GMM) supervectors. In this method, a tree-structured GMM is utilized to decr...
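A minimal sketch of mean-only MAP adaptation against a universal background model, stacked into a GMM supervector. It omits the tree-structured GMM speed-up the paper describes, and uses scikit-learn's GaussianMixture as an assumed UBM.

```python
# Hypothetical sketch: MAP-adapted GMM means stacked into a supervector.
import numpy as np
from sklearn.mixture import GaussianMixture

def map_supervector(frames, ubm, tau=10.0):
    """frames: (n, d) descriptors from one shot; ubm: fitted GaussianMixture."""
    post = ubm.predict_proba(frames)              # (n, k) responsibilities
    n_k = post.sum(0)                             # soft counts per component
    first = post.T @ frames                       # (k, d) first-order statistics
    mean_k = first / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + tau))[:, None]          # relevance factor
    adapted = alpha * mean_k + (1 - alpha) * ubm.means_
    # Normalize by mixture weight and (diagonal) covariance, then stack.
    sv = (np.sqrt(ubm.weights_)[:, None] * (adapted - ubm.means_)
          / np.sqrt(ubm.covariances_))
    return sv.ravel()

rng = np.random.default_rng(2)
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
ubm.fit(rng.normal(size=(1000, 32)))              # universal background model
print(map_supervector(rng.normal(size=(60, 32)), ubm).shape)  # (8 * 32,)
```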

Rong Yan, Wei-Hao Lin, Jun Yang et al.,
2006,
MM '06

We present an efficient system for video search that maximizes the use of human bandwidth, while at the same time exploiting the machine's ability to learn in real-time from user selected relevant vid...

We present a system that automatically tags videos, i.e. detects high-level semantic concepts like objects or actions in them. To do so, our system does not rely on datasets manually annotated for res...

We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activi...

Yueting Zhuang, Yanan Liu, Fei Wu et al.,
2006,
TRECVID

We participated in the high-level feature extraction and interactive-search task for TRECVID 2006. Interaction and integration of multi-modality media types such as visual, audio and textual data in v...

Zhang Wen, Yuxin Peng, Hongbo Sun et al.,
2013,
TRECVID

We participated in both types of the instance search (INS) task in TRECVID 2015: automatic search and interactive search. This paper presents our approaches and results. In this task, we mainly focused...

Meng Wang, Yi-Liang Zhao, Tat-Seng Chua et al.,
2014,
TOMCCAP

We often remember images and videos that we have seen or recorded before but cannot quite recall the exact venues or details of the contents. We typically have vague memories of the contents, which ca...