TSCSet: A Crowdsourced Time-Sync Comment Dataset for Exploration of User Experience Improvement

Time-Sync Comment (TSC) is a type of crowdsourced user review embedded in online video websites, which provides better real-time user interaction than traditional user comments. Various TSC-related problems and approaches have been studied to improve user experience by taking advantage of special characteristics of TSCs, such as their strong time reliance. However, existing TSC research has three major drawbacks. First, prior studies did not explicitly show the advantage of TSC features over traditional features in terms of user experience. Second, their experiments were conducted on inconsistent TSC datasets crawled from different sources, which makes the effectiveness of the methods less convincing. Third, the methods were manually evaluated by a limited number of so-called "experts", so it is hard for other researchers to obtain the data labels and reproduce the results. To overcome these drawbacks, this paper explores the usefulness of TSC data for improving online user experience by exploiting the TSC patterns in a new dataset. Specifically, we present a larger-scale TSC dataset with a four-level structure and rich self-labeled attributes, and formally define a group of TSC-related research problems based on this dataset. The problems are solved with adapted state-of-the-art methods and evaluated using crowdsourced labels in the dataset. The results can be regarded as a baseline for further research.
