FISER: A Feature‐Based Detection System for Person Interactions

Discovering the interactions between the persons mentioned in a set of topic documents helps readers construct the background of the topic and facilitates document comprehension. Discovering person interactions requires a detection method that identifies the text segments containing information about the interactions. Information extraction algorithms can then analyze these segments to extract interaction tuples and construct a network of person interactions. In this article, we define interaction detection as a classification problem. The proposed interaction detection method, called feature‐based interactive segment recognizer (FISER), exploits 19 features covering syntactic, context‐dependent, and semantic information in text to detect intra‐clausal and inter‐clausal interactive segments in topic documents. Empirical evaluations demonstrate that FISER outperforms many well‐known relation extraction and protein–protein interaction detection methods in identifying interactive segments in topic documents. The best feature combination achieves a precision of 72.9%, a recall of 55.8%, and an F1‐score of 63.2%.
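The reported F1‐score can be checked against the reported precision and recall, assuming the standard definition of F1 as the harmonic mean of the two (the abstract does not state which definition is used). The following is a minimal sketch; the function name `f1_score` is illustrative and not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for the best feature combination.
precision = 0.729
recall = 0.558

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.1f}%")  # prints "F1 = 63.2%"
```

Under this assumption, the three reported figures are mutually consistent.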
