Dissecting the Ambiguity of FMA Concept Names Using Taxonomy and Partonomy Structural Information

The complex internal structure of concept names in the Foundational Model of Anatomy (FMA) remains an obstacle to analyzing the ontology with lexical methods. A very common problem is ambiguity in names that contain one or more occurrences of the preposition “of.” In this paper, we propose an automatic method to help disambiguate FMA terms by leveraging taxonomy and partonomy information. If a sub-phrase of a concept name also appears in the names of its parents, that sub-phrase is likely to form a subtree of the name's parse tree and should be parsed as such. Using three test suites, we classified all concept names with a single occurrence of the preposition “of” according to whether their sub-phrases appear in the parent names. Results show that more than 90% of these names can be provided with useful information to assist their correct parsing.
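The core heuristic can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenization, and the example concept and parent names are all assumptions made for illustration, and the names have not been checked against the FMA. The sketch returns the longest contiguous sub-phrase of a concept name that also occurs in one of its parent names, which is the candidate to keep together as a subtree when parsing.

```python
from typing import Iterable, List, Optional, Tuple


def tokenize(name: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization is enough for illustration.
    return name.lower().split()


def contains_phrase(tokens: List[str], phrase: Tuple[str, ...]) -> bool:
    # True if `phrase` occurs as a contiguous run of whole words in `tokens`.
    k = len(phrase)
    return any(tuple(tokens[i:i + k]) == phrase
               for i in range(len(tokens) - k + 1))


def longest_shared_subphrase(name: str,
                             parent_names: Iterable[str]) -> Optional[str]:
    # Hypothetical helper: scan proper sub-phrases of `name`, longest first
    # (leftmost first among equal lengths), and return the first one that
    # also appears in any parent name. Returns None if nothing is shared.
    tokens = tokenize(name)
    parents = [tokenize(p) for p in parent_names]
    for length in range(len(tokens) - 1, 0, -1):
        for start in range(len(tokens) - length + 1):
            phrase = tuple(tokens[start:start + length])
            if any(contains_phrase(p, phrase) for p in parents):
                return " ".join(phrase)
    return None


if __name__ == "__main__":
    # Illustrative taxonomy-style parent: suggests [left [lobe of liver]].
    print(longest_shared_subphrase("Left lobe of liver", ["Lobe of liver"]))
    # Illustrative partonomy-style parent: only the object of "of" is shared.
    print(longest_shared_subphrase("Left lobe of liver", ["Liver"]))
```

In this sketch a match on the full phrase "lobe of liver" would favor attaching the modifier outside that phrase, while a match on "liver" alone gives weaker evidence. How the paper's three test suites weigh such cases is not specified here.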
