Violence Rating Prediction from Movie Scripts

Violent content in movies can influence viewers' perception of society. For example, frequent depictions of certain demographics as perpetrators or victims of abuse can shape stereotyped attitudes. In this work, we propose to characterize aspects of violent content in movies solely from the language used in their scripts. This makes our method applicable in the early stages of content creation, even before a movie is produced, and complementary to previous work that relies on audio or video available only after production. Our approach is based on a broad range of features designed to capture lexical, semantic, sentiment, and abusive-language characteristics. We use these features to learn vector representations of (1) the complete movie and (2) each act in the movie. The former representation is used to train a movie-level classification model, and the latter to train deep-learning sequence classifiers that make use of context. We tested our models on a dataset of 732 Hollywood scripts annotated by experts for violent content. Our evaluation suggests that linguistic features are a good indicator of violent content, and our ablation studies show that semantic and sentiment features are the most important predictors of violence in this data. To the best of our knowledge, we are the first to show that the language used in movie scripts is a strong indicator of violent content, offering novel computational tools to assist in creating awareness of violent content in storytelling.
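As a rough illustration (not the authors' implementation), the sketch below shows how a movie-level classifier of the kind described above could be assembled in Python: lexical TF-IDF n-gram features feed a linear classifier, standing in for the fuller feature set (semantic, sentiment, and abusive-language features) and for the act-level sequence models. The names scripts and labels in the usage comment are hypothetical placeholders for script texts and expert violence ratings.

# Minimal sketch, not the authors' implementation: a movie-level violence
# classifier over script text. Only lexical TF-IDF n-gram features and a
# linear model are shown; the paper additionally uses semantic, sentiment,
# and abusive-language features, plus act-level sequence classifiers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.model_selection import cross_val_score

def movie_level_violence_classifier() -> Pipeline:
    """Bag-of-n-grams features feeding a linear classifier."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    )

# Hypothetical usage: `scripts` is a list of full script texts, `labels`
# holds expert violence ratings (e.g., "low" / "medium" / "high").
# scores = cross_val_score(movie_level_violence_classifier(), scripts, labels, cv=5)
# print(scores.mean())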
