A novel method to summarize and retrieve text documents using text feature extraction based on ontology

Data retrieval is the key process of acquiring information according to a requirement. Nowadays, the need for accurate information has increased. The most basic tool that provides this service is the browser: it traverses the data according to the user's query and returns search results for all related information. Finding the required information among these results therefore becomes time-consuming. This paper focuses on content-based data mining using ontology and text feature extraction. Content-based data mining focuses on the domain of the data. An ontology is itself a domain-based information system, which helps retrieve the required data more appropriately. The proposed system uses the k-means clustering algorithm to create flat clusters. Flat clusters are the primary classification of the data and are used for further processing. For more appropriate data retrieval, the system uses a text feature extraction algorithm, which helps reduce noisy data in the data sets. Noise-free data in turn supports a better data retrieval process.
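
The two steps named in the abstract, text feature extraction followed by k-means clustering into flat clusters, can be illustrated with a minimal sketch. The abstract does not say which features or which stop-word list the system uses, so the TF-IDF weighting, the `STOP_WORDS` set, the deterministic centroid initialisation, and all function names below are assumptions made for illustration. The ontology component is not modelled here.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words (simple noise reduction)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * (math.log(n / df[t]) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's k-means. Centroids start at the first k vectors (an assumption
    chosen to keep the sketch deterministic). Returns one cluster label per vector."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = [
    "Beach resorts and hotel booking for summer travel",
    "Crop irrigation and soil management for wheat farming",
    "Cheap hotel and flight booking for beach travel",
    "Soil fertilizer improves wheat crop yield in farming",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]: travel documents in one flat cluster, farming in the other
```

In this toy corpus the stop-word filter removes the only words the two topics share, so the travel and farming documents become orthogonal in feature space and separate cleanly. Real collections would need a richer feature extraction step and a less naive centroid initialisation.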
