Archiving and Analyzing Tweets and Webpages with the DLRL Hadoop Cluster
Sunshin Lee
Dept. of Computer Science, Virginia Tech
Blacksburg, VA 24061 USA
sslee777@vt.edu

Edward A. Fox
Dept. of Computer Science, Virginia Tech
Blacksburg, VA 24061 USA
fox@vt.edu

ABSTRACT
In the Integrated Digital Event Archive and Library (IDEAL) [1] project, we research the next-generation integration of digital libraries and event archiving. The project team has been collecting Internet information, such as tweets and webpages, related to crises and tragedies, as well as to recovery and government/community events. This poster describes the Hadoop cluster in the Digital Library Research Laboratory (DLRL) of the Department of Computer Science, Virginia Tech, and its use in archiving and analyzing tweets and webpages.
[1] Edward A. Fox, et al. A digital library for water main break identification and visualization, 2012, JCDL '12.
[2] Edward A. Fox, et al. Read between the lines: A Machine Learning Approach for Disambiguating the Geo-location of Tweets, 2015, JCDL.