Analysis and evaluation of unstructured data: text mining versus natural language processing

Today, most of the information stored in companies is unstructured. Retrieving and extracting this information is an essential task in semantic web research, and much of it depends on efficient storage and analysis of unstructured data. Merrill Lynch recently estimated that more than 80% of all potentially useful business information is unstructured. The sheer volume and complexity of unstructured data open up many new possibilities for the analyst. We analyze structured and unstructured data both individually and collectively. Text mining and natural language processing are two techniques, each with its own methods, for discovering knowledge from textual content in documents. In this study, text mining and natural language processing techniques are illustrated. The aim of this work is to compare and evaluate the similarities and differences between text mining and natural language processing in extracting useful information, each through its own methods.
