Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

With the growing interest in social networking, the interaction among social actors has evolved into a source of knowledge on which context-aware reasoning becomes possible. Information extraction from social networking sites, especially Twitter and Facebook, is one of the open problems in this area. Extracting information from social networking text requires several lexical features and large-scale word clustering. We attempt to extend an existing tokenizer and to develop our own tagger in order to handle the misspelled and non-standard words that currently appear on Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous work, and to design and develop an extraction model for constructing a large knowledge base of actions.
