A personalized query expansion approach for engineering document retrieval

Engineers create engineering documents using their own terminologies, and they want to search existing engineering documents quickly and accurately during the product development process. Keyword-based search methods have been widely used because they are easy to use, but their search accuracy has often been problematic because of the semantic ambiguity of terminologies in engineering documents and queries. This semantic ambiguity can be alleviated by using a domain ontology. In addition, if queries are expanded to incorporate the engineer’s personalized information needs, the accuracy of the search results should improve. We therefore propose a framework for searching engineering documents with less semantic ambiguity and more focus on each engineer’s personalized information needs. The framework comprises four processes: (1) developing a domain ontology, (2) indexing engineering documents, (3) learning user profiles, and (4) performing personalized query expansion and retrieval. A domain ontology is developed from product structure information and engineering documents. Using the domain ontology, terminologies in documents are disambiguated and indexed. A user profile is also generated from the domain ontology, and through user profile learning, the user’s interests are captured from relevant documents. During personalized query expansion, the learned user profile is used to reflect the user’s interests. At the same time, the user’s search intent, which is implicitly inferred from the user’s task context, is also considered. To retrieve relevant documents, the expanded query, which reflects both the user’s interests and intent, is then matched against the document collection. The experimental results show that the proposed approach can substantially outperform both the keyword-based approach and the existing query expansion method in retrieving engineering documents. Precisely reflecting a user’s information needs has been identified as the most important factor underlying this notable improvement.
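The personalized query expansion and retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the concept names, the `alpha` and `beta` mixing weights, and the dot-product scoring are all assumptions introduced for illustration. The sketch expands a query with ontology-related concepts, weights each added concept by the learned user profile (interests) and by the task context (intent), and ranks documents that have been indexed by concept.

```python
from collections import defaultdict

# Hypothetical toy domain ontology: concept -> related concepts.
# (Assumed for illustration; the paper derives its ontology from
# product structure information and engineering documents.)
ONTOLOGY = {
    "brake": ["brake_pad", "caliper"],
    "brake_pad": ["brake"],
    "caliper": ["brake"],
    "engine": ["piston"],
    "piston": ["engine"],
}


def expand_query(query_concepts, profile, task_context, alpha=0.5, beta=0.3):
    """Return concept weights for an expanded query.

    profile: concept -> interest weight learned from relevant documents.
    task_context: set of concepts tied to the user's current task.
    alpha, beta: assumed mixing weights for interest and intent.
    """
    weights = defaultdict(float)
    for concept in query_concepts:
        weights[concept] += 1.0  # original query concepts keep full weight
        for related in ONTOLOGY.get(concept, []):
            interest = alpha * profile.get(related, 0.0)
            intent = beta * (1.0 if related in task_context else 0.0)
            if interest + intent > 0:
                weights[related] += interest + intent
    return dict(weights)


def score(expanded, doc_concepts):
    """Dot product between expanded query weights and concept counts."""
    return sum(w * doc_concepts.get(c, 0) for c, w in expanded.items())


def retrieve(expanded, index):
    """Rank documents with a positive score, highest first."""
    scored = [(score(expanded, concepts), doc) for doc, concepts in index.items()]
    ranked = sorted((s, d) for s, d in scored if s > 0)
    return [d for s, d in reversed(ranked)]


# Usage with hypothetical data: a concept-level index of three documents.
index = {
    "doc_a": {"brake": 1, "caliper": 2},
    "doc_b": {"brake": 1, "brake_pad": 1},
    "doc_c": {"engine": 3},
}
profile = {"caliper": 0.8}      # user has shown interest in calipers
task_context = {"brake_pad"}    # current task involves brake pads
expanded = expand_query(["brake"], profile, task_context)
results = retrieve(expanded, index)
```

In this sketch, a plain keyword query for "brake" would score `doc_a` and `doc_b` equally, whereas the expanded query separates them according to the user's profile and task context. How the actual framework combines interests and intent is not specified in the abstract, so the linear combination here is only one possible choice.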
