Volunteered geographic information production as a spatial process

Wikipedia is a free encyclopedia that anyone can edit, and a popular example of user-generated content that includes volunteered geographic information (VGI). In this article, we present three main contributions: (1) a spatial data model and collection methods to study VGI in systems that may not explicitly support geographic data; (2) quantitative methods for measuring distance between online authors and articles; and (3) empirically calibrated results from a gravity model of the role of distance in VGI production. To model the spatial processes of VGI contributors, we use an invariant exponential gravity model based on article and author proximity. We define a proximity metric called a ‘signature distance’ as a weighted average distance between an article and each of its authors, and we estimate the locations of 2.8 million anonymous authors through IP geolocation. Our study collects empirical data directly from 21 language-specific Wikipedia databases, spanning 7 years of contributions (2001–2008) to nearly 1 million geotagged articles. We find empirical evidence that the spatial processes of anonymous contributors fit an exponential distance decay model. Our results are consistent with prior results on information diffusion as a spatial process, but run counter to theories that a globalized Internet neutralizes distance as a determinant of social behaviors.
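
As a rough illustration of the 'signature distance' idea described above, the following Python sketch computes a weighted average great-circle distance between one article and its authors, together with a generic exponential distance decay term. It is a minimal sketch under stated assumptions, not the method used in the study: the abstract does not specify the weights, so per-author edit counts are assumed here, and the haversine formula, the function names, and the decay parameter `beta` are all illustrative choices.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def signature_distance(article, authors):
    """Weighted average distance between an article and each of its authors.

    article: (lat, lon) of the geotagged article.
    authors: iterable of (lat, lon, weight) tuples. The weight is ASSUMED to be
             something like the author's edit count; the abstract does not
             define it.
    """
    authors = list(authors)
    total_weight = sum(w for _, _, w in authors)
    if total_weight <= 0:
        raise ValueError("at least one author with positive weight is required")
    weighted_sum = sum(
        w * haversine_km(article[0], article[1], lat, lon) for lat, lon, w in authors
    )
    return weighted_sum / total_weight


def exponential_decay(distance_km, beta):
    """Generic exponential distance decay term exp(-beta * d).

    beta is a hypothetical decay parameter, not a value reported in the study.
    """
    return math.exp(-beta * distance_km)
```

For example, `signature_distance((0.0, 0.0), [(0.0, 0.0, 3), (0.0, 90.0, 1)])` averages a distance of zero (weight 3) with a quarter of the Earth's circumference (weight 1), so frequent nearby contributors pull the signature distance toward zero.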
