Spatial Statistics with Three-Tier Breadth First Search for Analyzing Social Geocontents

The objective of this paper is to clarify the relationship between a user's context and the words the user actually inputs, in order to realize a context-aware Japanese input method editor. We propose two spatial analysis methods for finding location-dependent words in a large body of Japanese text with geographical information. We analyze half a million tweets gathered by our system since December 2009. First, we analyze the standard deviation of latitude and longitude, which indicates the degree of geographic variation. This method is very simple, but it cannot find keywords that depend on several locations. For example, famous department stores with branches all over Japan have a large standard deviation, yet each branch depends on its own location. Therefore, we propose a three-tier breadth-first search, in which the search area is divided into a square mesh and we extract the cells that contain more tweets than the average for the enclosing upper-tier area. The extracted cells are then re-divided into smaller cells. Our method can extract several locations for a single keyword.
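The two analyses described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that tweets for one keyword are given as (latitude, longitude) pairs lying inside the search bounds, that each area is split into a 4 x 4 mesh, and that "more than the average of the upper area" means a cell holds more tweets than the mean count per cell of its parent area. The names `coordinate_spread`, `three_tier_search`, `splits`, and `tiers`, as well as the sample coordinates, are hypothetical.

```python
from collections import deque
from statistics import pstdev


def coordinate_spread(points):
    """Population standard deviation of latitude and of longitude."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return pstdev(lats), pstdev(lons)


def three_tier_search(points, bounds, splits=4, tiers=3):
    """Breadth-first mesh refinement (assumed reading of the abstract).

    bounds is (lat_min, lat_max, lon_min, lon_max); points are assumed to
    lie inside it. Each area is split into splits x splits cells, and a
    cell is kept when it holds more points than the mean count per cell of
    its parent area. Kept cells are re-divided until the last tier.
    Returns a list of (cell_bounds, point_count) for the deepest tier.
    """
    queue = deque([(bounds, points, 1)])
    hot_cells = []
    while queue:
        (lat0, lat1, lon0, lon1), pts, tier = queue.popleft()
        dlat = (lat1 - lat0) / splits
        dlon = (lon1 - lon0) / splits
        cells = {}
        for lat, lon in pts:
            # Clamp so points on the upper edge fall into the last cell.
            i = min(int((lat - lat0) / dlat), splits - 1)
            j = min(int((lon - lon0) / dlon), splits - 1)
            cells.setdefault((i, j), []).append((lat, lon))
        average = len(pts) / (splits * splits)
        for (i, j), members in cells.items():
            if len(members) <= average:
                continue  # not denser than the parent's average cell
            cell = (lat0 + i * dlat, lat0 + (i + 1) * dlat,
                    lon0 + j * dlon, lon0 + (j + 1) * dlon)
            if tier == tiers:
                hot_cells.append((cell, len(members)))
            else:
                queue.append((cell, members, tier + 1))
    return hot_cells


# Hypothetical sample: one keyword tweeted mostly around two cities,
# plus three isolated tweets elsewhere.
tokyo, osaka = (35.68, 139.76), (34.69, 135.50)
sample = [tokyo] * 10 + [osaka] * 10 + [
    (43.06, 141.35), (33.59, 130.40), (38.27, 140.87)]
japan = (30.0, 46.0, 128.0, 146.0)

lat_sd, lon_sd = coordinate_spread(sample)
hot = three_tier_search(sample, japan)
print(f"std dev: lat={lat_sd:.2f}, lon={lon_sd:.2f}")
for cell, count in hot:
    print(cell, count)
```

In this sample the standard deviation alone only reports that the keyword is spread out, whereas the mesh search returns one small cell around each of the two dense locations and discards the isolated tweets at the first tier.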
