Extracting Meronymy Relationships from Domain-Specific, Textual Corporate Databases

Various techniques exist for learning meronymy (part-whole) relationships from open-domain corpora. However, extracting meronymy relationships from domain-specific, textual corporate databases has been overlooked, despite numerous application opportunities, particularly in domains such as product development and/or customer service. These domains also pose new scientific challenges. Elaborate knowledge resources are absent, which compromises the performance of supervised meronymy-learning algorithms, and the domain-specific terminology of corporate texts makes it difficult to select appropriate seeds for minimally supervised meronymy-learning algorithms. To address these issues, we develop and present a principled approach to extract accurate meronymy relationships from textual databases of product development and/or customer service organizations by leveraging reliable meronymy lexico-syntactic patterns harvested from an open-domain corpus. Evaluations on real-life corporate databases indicate that our technique extracts precise meronymy relationships that provide valuable operational insights into the causes of product failures and customer dissatisfaction. Our results also reveal that some of the domain-specific meronymy relationships extracted from the corporate data cannot be conclusively and unambiguously classified under well-known taxonomies of relationships.
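As a rough illustration of how lexico-syntactic patterns can be applied to corporate text, the sketch below matches a few hand-written patterns against sentences, restricted to a known list of domain terms. It is a minimal sketch under stated assumptions, not the approach evaluated here: the patterns in `PATTERNS`, the term list, and the function names are all hypothetical, and the harvesting and reliability ranking of patterns from an open-domain corpus is not shown.

```python
import re

# Hypothetical patterns with {part} and {whole} slots (assumed for illustration;
# the actual patterns are harvested from an open-domain corpus).
PATTERNS = [
    "{whole} consists of {part}",
    "{whole} contains {part}",
    "{part} is part of {whole}",
]

DETERMINER = r"(?:the |a |an )?"


def compile_patterns(terms):
    """Turn each slot pattern into a regex that only accepts known domain terms."""
    # Longest terms first, so "fuel pump" is preferred over "pump".
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    compiled = []
    for pattern in PATTERNS:
        regex = pattern.format(
            part=DETERMINER + "(?P<part>" + alternation + ")",
            whole=DETERMINER + "(?P<whole>" + alternation + ")",
        )
        compiled.append(re.compile(r"\b" + regex + r"\b", re.IGNORECASE))
    return compiled


def extract_meronyms(sentences, terms):
    """Return a sorted list of (part, whole) pairs found in the sentences."""
    compiled = compile_patterns(terms)
    pairs = set()
    for sentence in sentences:
        for regex in compiled:
            for match in regex.finditer(sentence):
                pairs.add((match.group("part").lower(),
                           match.group("whole").lower()))
    return sorted(pairs)


if __name__ == "__main__":
    # Hypothetical domain terms and sentences.
    domain_terms = ["engine", "fuel pump", "pump", "display", "control panel"]
    texts = [
        "The engine consists of a fuel pump and a filter.",
        "The display is part of the control panel.",
        "Customer reported that the pump was noisy.",
    ]
    for part, whole in extract_meronyms(texts, domain_terms):
        print(part, "is a part of", whole)
```

Restricting matches to a list of domain terms is a design choice made here to keep the example short; it avoids noun-phrase chunking, at the cost of missing parts and wholes that are not in the list.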
