Mining and Modeling Character Networks

We investigate social networks of characters found in cultural works such as novels and films. These character networks exhibit many of the properties of complex networks, such as skewed degree distributions and community structure, but may be of relatively small order with a high multiplicity of edges. Building on recent work of Beveridge and Shan [4], we consider graph extraction, visualization, and network statistics for three novels: Twilight by Stephenie Meyer, Stephen King’s The Stand, and J.K. Rowling’s Harry Potter and the Goblet of Fire. Combining these with 800 character networks from films found in the http://moviegalaxies.com/ database, we compare the data sets to simulations from various stochastic complex network models, including random graphs with given expected degrees (also known as the Chung-Lu model), the configuration model, and the preferential attachment model. Using machine learning techniques based on motif (or small subgraph) counts, we determine that the Chung-Lu model best fits character networks, and we conjecture why this may be the case.
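To make the comparison between models and motif counts concrete, the following is a minimal sketch of one common simple-graph variant of the Chung-Lu model, together with two small motif counts (triangles and three-node paths). It is an illustration under stated assumptions, not the pipeline used in the paper: the function names, the example weights, and the choice to ignore self-loops and edge multiplicities are all hypothetical.

```python
import random
from itertools import combinations


def chung_lu_graph(weights, seed=None):
    """Sample a simple graph with given expected degrees (assumed variant).

    Each pair {i, j} with i != j is joined independently with probability
    min(1, w_i * w_j / sum(w)). Self-loops and multi-edges are omitted here,
    which is a simplifying assumption of this sketch.
    """
    rng = random.Random(seed)
    total = sum(weights)  # must be positive
    adj = {i: set() for i in range(len(weights))}
    for i, j in combinations(range(len(weights)), 2):
        p = min(1.0, weights[i] * weights[j] / total)
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_triangles(adj):
    """Count 3-cliques, each exactly once, by ordering nodes u < v < w."""
    count = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count


def count_wedges(adj):
    """Count three-node paths (open or closed), one per center and neighbor pair."""
    return sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())


if __name__ == "__main__":
    # Hypothetical skewed weight sequence: a few central characters, many minor ones.
    weights = [8, 5, 5, 3, 2, 2, 1, 1, 1]
    g = chung_lu_graph(weights, seed=1)
    print("triangles:", count_triangles(g), "wedges:", count_wedges(g))
```

Motif counts of this kind, computed for an observed character network and for many samples from each candidate model, could serve as the feature vectors on which a classifier is trained; the specific motifs and classifier used in the paper are not reproduced here.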

[1] Alexandros G. Dimakis, et al. Beyond Triangles: A Distributed Framework for Estimating 3-profiles of Large Graphs, 2015, KDD.

[2] Alexandros G. Dimakis, et al. Distributed Estimation of Graph 4-Profiles, 2016, WWW.

[3] P. M. Gleiser. How to become a superhero, 2007.

[4] Mirella Lapata, et al. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 1999, ACL 1999.

[5] Kathleen McKeown, et al. Extracting Social Networks from Literary Fiction, 2010, ACL.

[6] Kathryn Fraughnaugh, et al. Introduction to graph theory, 1973, Mathematical Gazette.

[7] Jeannette C. M. Janssen, et al. Model Selection for Social Networks Using Graphlets, 2012, Internet Math.

[8] Mathieu Bastian, et al. Gephi: An Open Source Software for Exploring and Manipulating Networks, 2009, ICWSM.

[9] Christopher T. Workman, et al. DASS: efficient discovery and p-value calculation of substructures in unordered data, 2007, Bioinform.

[10] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[11] Christopher M. Danforth, et al. The emotional arcs of stories are dominated by six basic shapes, 2016, EPJ Data Science.

[12] Anthony Bonato, et al. Dimensionality of Social Networks Using Motifs and Eigenvalues, 2014, PloS one.

[13] J. Miro-Julia, et al. Marvel Universe looks almost like a real social network, 2002.

[14] Mauricio Aparecido Ribeiro, et al. The complex social network from The Lord of The Rings, 2015, ArXiv.

[15] Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.

[16] Anthony Bonato, et al. Complex Networks and Social Networks, 2012.

[17] Graham Sack. Character Networks for Narrative Generation, 2012, INT@AIIDE.

[18] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.

[19] Natasa Przulj, et al. Biological network comparison using graphlet degree distribution, 2007, Bioinform.

[20] Owen Rambow, et al. Social Network Analysis of Alice in Wonderland, 2012, CLfL@NAACL-HLT.

[21] Anthony Bonato, et al. A course on the Web graph, 2008.

[22] Andrew Beveridge, et al. Network of Thrones, 2016.

[23] Kurt Mehlhorn, et al. Efficient graphlet kernels for large graph comparison, 2009, AISTATS.