Interpretable video tag recommendation with multimedia deep learning framework

Purpose: Tags help promote customer engagement on video-sharing platforms. Video tag recommender systems are artificial intelligence-enabled frameworks that aim to recommend precise tags for videos. Extant video tag recommender systems are uninterpretable, which leads to distrust of the recommendation outcome, hesitation in adopting tags and difficulty in debugging the system. This study aims to construct a novel, interpretable video tag recommender system that assists video-sharing platform users in tagging their newly uploaded videos.

Design/methodology/approach: The proposed interpretable video tag recommender system is a multimedia deep learning framework composed of convolutional neural networks (CNNs), which receives texts and images as inputs. The interpretability of the proposed system is realized through layer-wise relevance propagation.

Findings: The case study and user study demonstrate that the proposed interpretable multimedia CNN model can effectively explain its recommended tag to users by highlighting the keywords and key image patches that contribute the most to the recommended tag. Moreover, the proposed model outperforms state-of-the-art models in recommendation performance.

Practical implications: The interpretability of the proposed recommender system makes its decision process more transparent, builds users’ trust in the recommender system and prompts users to adopt the recommended tags. Labeling videos with human-understandable and accurate tags would increase the exposure of videos to their target audiences, which enhances information technology (IT) adoption, customer engagement, value co-creation and precision marketing on the video-sharing platform.

Originality/value: To the best of our knowledge, the proposed model is not only the first explainable video tag recommender system but also the first explainable multimedia tag recommender system.
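The idea behind layer-wise relevance propagation, which the abstract names as the source of interpretability, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny two-layer dense ReLU network without biases instead of the paper's multimedia CNN, it uses the epsilon rule (one of several LRP propagation rules), and all weights, inputs and names (`lrp_epsilon`, `W1`, `W2`, `x`) are hypothetical. The four input features stand in for the words or image patches whose relevance would be highlighted.

```python
import numpy as np

def lrp_epsilon(weights, activations, relevance_out, eps=1e-9):
    """Redistribute relevance from a dense layer's outputs to its inputs (epsilon rule)."""
    z = activations @ weights                      # pre-activations of the layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer avoids division by zero
    s = relevance_out / z                          # relevance per unit of pre-activation
    return activations * (weights @ s)             # each input's share of the relevance

# Hypothetical toy model: 4 input features -> 3 hidden units -> 2 candidate tags.
W1 = np.array([[ 1.0, -2.0,  0.5],
               [ 0.5,  1.0, -1.0],
               [-1.0,  0.5,  1.0],
               [ 0.2,  0.3,  0.4]])
W2 = np.array([[ 1.0, -0.5],
               [ 0.3,  0.8],
               [-0.5,  1.0]])
x = np.array([1.0, 0.5, 0.0, 2.0])                 # hypothetical input features

# Forward pass.
h = np.maximum(0.0, x @ W1)                        # hidden ReLU activations
scores = h @ W2                                    # one score per candidate tag
tag = int(np.argmax(scores))                       # the recommended tag

# Backward relevance pass: start from the score of the recommended tag only.
R_out = np.zeros_like(scores)
R_out[tag] = scores[tag]
R_hidden = lrp_epsilon(W2, h, R_out)
R_input = lrp_epsilon(W1, x, R_hidden)

print("recommended tag:", tag)
print("input relevance:", np.round(R_input, 4))
```

In this toy setting the input relevances sum (up to the stabilizer) to the score of the recommended tag, so each value can be read as that feature's contribution to the recommendation. The largest values mark the inputs an interface would highlight.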
