On the Selection of Anchors and Targets for Video Hyperlinking

A problem that is not well understood in video hyperlinking is what qualifies a fragment as an anchor or a target. Ideally, anchors provide good starting points for navigation, and targets supplement anchors with additional details without distracting users with irrelevant, false, or redundant information. The problem is nontrivial because data characteristics and user expectations are intertwined. Imagine that, in a large dataset, clusters of fragments are spread over the feature space. The nature of each cluster can be described by its size (implying popularity) and its structure (implying complexity). A principled way of hyperlinking is to pick the centers of clusters as anchors and, from there, reach out to targets within or outside the clusters while taking neighborhood complexity into account. The question is which fragments should be selected as anchors or targets so as to reflect the rich content of a dataset while minimizing the risk of a frustrating user experience. This paper provides some insights into this question from the perspective of hubness and local intrinsic dimensionality, two statistical properties for assessing the popularity and complexity of a data space. Based on these properties, two novel algorithms are proposed for low-risk automatic selection of anchors and targets.
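As a rough illustration of the two statistics named above, the following sketch computes a common hubness score (the k-occurrence count, i.e. how often a point appears in other points' k-nearest-neighbor lists) and a maximum-likelihood estimate of local intrinsic dimensionality from k-nearest-neighbor distances. It is not the paper's selection algorithm: the function names, the Euclidean metric, and the choice of k are assumptions made for this example, and how the two scores would be combined to pick anchors and targets is not described in the abstract.

```python
import numpy as np

def knn(X, k):
    """Brute-force k nearest neighbors (Euclidean), excluding the point itself.

    Returns (idx, dist), each of shape (n, k), with neighbors sorted by
    increasing distance. Suitable only for small illustrative datasets.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]
    dist = np.take_along_axis(d, idx, axis=1)
    return idx, dist

def k_occurrence(idx, n):
    """Hubness score: number of k-NN lists in which each point appears."""
    return np.bincount(idx.ravel(), minlength=n)

def lid_mle(dist, eps=1e-12):
    """MLE of local intrinsic dimensionality from sorted k-NN distances.

    Uses LID(x) = -1 / mean_i(log(r_i / r_k)), where r_k is the distance
    to the k-th neighbor. eps guards against zero distances (duplicates).
    """
    r = np.maximum(dist, eps)
    logs = np.log(r / r[:, -1:])
    return -1.0 / np.minimum(np.mean(logs, axis=1), -eps)

def hubness_and_lid(X, k=20):
    """Convenience wrapper (hypothetical) returning both scores per point."""
    idx, dist = knn(X, k)
    return k_occurrence(idx, X.shape[0]), lid_mle(dist)
```

Under these assumptions, points with a high k-occurrence count correspond to "popular" fragments, and the LID estimate gives a per-point measure of how complex the surrounding neighborhood is. For example, `hubness_and_lid(features, k=20)` on an array of fragment features (one row per fragment) returns one hubness score and one LID estimate per fragment.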
