Modeling Social Readers: Novel Tools for Addressing Reception from Online Book Reviews

Social reading sites offer an opportunity to capture a segment of readers’ responses to literature, while data-driven analysis of these responses can provide new critical insight into how people ‘read’. Posts discussing an individual book on the social reading site Goodreads are referred to as ‘reviews’, and consist of summaries, opinions, quotes or some mixture of these. Computationally modelling these reviews allows one to discover the non-professional discussion space about a work, including an aggregated summary of the work’s plot, an implicit sequencing of various subplots and readers’ impressions of main characters. We develop a pipeline of interlocking computational tools to extract a representation of this reader-generated shared narrative model. Using a corpus of reviews of five popular novels, we discover readers’ distillation of the novels’ main storylines and their sequencing, as well as readers’ varying impressions of the characters in each novel. In so doing, we make three important contributions to the study of infinite-vocabulary networks: (i) an automatically derived narrative network that includes meta-actants; (ii) a sequencing algorithm, REV2SEQ, that generates a consensus sequence of events based on partial trajectories aggregated from reviews; and (iii) an ‘impressions’ algorithm, SENT2IMP, that provides multi-modal insight into readers’ opinions of characters.
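To make the idea of a consensus sequence built from partial trajectories concrete, the following is a minimal sketch and not the REV2SEQ algorithm itself, whose details are not given here. It assumes each review has already been reduced to an ordered list of event labels, counts how often one event precedes another across reviews, and ranks events by their net precedence score. The function name `consensus_sequence` and the event labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def consensus_sequence(partial_sequences):
    """Order events by how often each one precedes the others across reviews."""
    # precedes[(a, b)] counts the reviews in which event a appears before event b.
    precedes = defaultdict(int)
    events = set()
    for seq in partial_sequences:
        events.update(seq)
        # combinations() keeps the in-review order, so a always precedes b here.
        for a, b in combinations(seq, 2):
            precedes[(a, b)] += 1

    # Net score: times an event comes first in a pair minus times it comes second.
    score = {e: 0 for e in events}
    for (a, b), count in precedes.items():
        score[a] += count
        score[b] -= count

    # A higher net score places an event earlier; ties are broken alphabetically.
    return sorted(events, key=lambda e: (-score[e], e))


# Hypothetical partial trajectories, one per review (made-up event labels).
reviews = [
    ["trial", "verdict"],
    ["trial", "verdict"],
    ["arrival", "trial"],
    ["arrival", "verdict", "attack"],
    ["verdict", "attack"],
]
print(consensus_sequence(reviews))
```

No single review in this toy input mentions all four events, yet the pairwise counts are enough to recover one overall ordering. A score-based ranking is only one simple aggregation choice; it ignores conflicting evidence beyond the net counts.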
