A data-synthesis-driven method for detecting and extracting vague cognitive regions

ABSTRACT Cognitive regions and places are notoriously difficult to represent in geographic information science and systems. The exact delineation of cognitive regions is challenging because their borders are vague, membership within the regions varies non-monotonically, and raters cannot be assumed to assess membership consistently and homogeneously. In a study published in this journal in 2014, researchers devised a novel grid-based task in which participants rated the membership of individual cells in a given region, and contrasted this approach with a standard boundary-drawing task. Specifically, the authors assessed the vague cognitive regions of Northern California and Southern California. The boundary between these cognitive regions was found to have variable width, and region membership peaked not at the most northern or southern cells but at substantially less extreme latitudes. The authors thus concluded that region membership is about attitude, not just latitude. In the present work, we reproduce this study from a computational fourth-paradigm perspective, i.e., by synthesizing high volumes of heterogeneous data from various sources. We compare the regions we identify with those from the 2014 human-participants study, identifying differences and commonalities. Our results show a significant positive correlation with those of the original study. Beyond the extracted regions themselves, we compare and contrast the empirical and analytical approaches of the two methods, one a conventional human-participants study and the other an application of increasingly popular data-synthesis-driven research methods in GIScience.
