Twitter Information Extraction for Smart City

Bandung is the second most active city on Twitter in Indonesia, which means that Bandung residents share a large volume of tweets. These tweets can serve as a data source for exploring information about the city. One example is traffic congestion: the location, date, and time at which congestion occurred. In this study, we propose a method to filter tweets related to traffic congestion in Bandung and to extract the location, time, date, and image (if any) from them. For the filtering step, we use an SVM with several variations of feature weighting and feature selection. The highest accuracy, 83%, was obtained with binary weighting on the top 2,000 features. The information extraction step uses a rule-based approach and gave satisfactory results, with accuracy of around 98%-100% for date, time, and URL extraction. Location extraction, however, reached an accuracy of only about 62%. These errors were caused by out-of-vocabulary (OOV) and out-of-rules (OOR) cases.
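As an illustration of what rule-based extraction of date, time, and URL can look like, here is a minimal Python sketch. The abstract does not state the actual rules, so the regular expressions, the `extract_fields` function, the supported formats (`dd/mm/yyyy` dates, `hh:mm` or `hh.mm` times), and the sample tweet are all assumptions made for this example. Location extraction is left out because it depends on a vocabulary of place names that is not given here.

```python
import re

# Hypothetical patterns for illustration; not the rules used in the study.
URL_RE = re.compile(r"https?://\S+")
DATE_RE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b")
TIME_RE = re.compile(r"\b([01]?\d|2[0-3])[:.]([0-5]\d)\b")


def extract_fields(tweet):
    """Return the first date and time found in a tweet, plus all URLs."""
    urls = URL_RE.findall(tweet)
    # Strip URLs first so digits inside links are not read as dates or times.
    text = URL_RE.sub(" ", tweet)

    date_match = DATE_RE.search(text)
    time_match = TIME_RE.search(text)

    date = date_match.group(0) if date_match else None
    # Normalise "17.30" and "7:05" to a zero-padded hh:mm form.
    time = None
    if time_match:
        hour, minute = time_match.groups()
        time = f"{int(hour):02d}:{minute}"

    return {"date": date, "time": time, "urls": urls}


if __name__ == "__main__":
    sample = "Macet di Jl. Soekarno Hatta 12/05/2014 17.30 WIB http://t.co/abc123"
    print(extract_fields(sample))
```

Fixed patterns like these only cover the formats they were written for. A tweet that writes the date or time in another way would fall outside the rules, which is the kind of case the abstract calls out-of-rules (OOR).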
