Text mining from biomedical domain using a full parser

Text mining is vital for knowledge discovery. With this in mind, we developed a system that uses a full parser to analyze text, with a grammar oriented toward the biomedical domain. We propose a preprocessor to overcome the shortcomings of full parsing, along with modules that handle partial parse results. The resulting system is easy to maintain and can be adapted to a particular domain. In a preliminary experiment on 131 argument structures drawn from 96 sentences, 32 were extractable, 33 were extractable with ambiguity, and the remaining 66 were non-extractable, so only partial results were determined for them. The system produced better results than the other full parser, with fewer extraction failures and less ambiguity.
