A Hybrid Model For Phrase Chunking Employing Artificial Immunity System And Rule Based Methods

Natural Language Understanding (NLU), an important field of Artificial Intelligence (AI), is concerned with speech and language understanding between humans and computers. Understanding language means knowing what concept a word or phrase stands for and how to link these concepts to form a meaningful sentence. Identifying phrases, or phrase chunking, is an important step in NLU. A chunker identifies and divides sentences into syntactically correlated word groups. Question Answering (QA) systems, another important application of AI, mostly require the retrieval of nouns or noun phrases as answers to the questions raised by users. Chunking is also an important preprocessing step in full parsing. Because natural language is highly ambiguous, exact parsing of text may become very complex. This ambiguity may be partially resolved by using chunking as an intermediate step. To the best of our knowledge, no prior work or tag set is available for phrase chunking in Malayalam. To separate the chunks in a document, the document must first be labeled with part-of-speech (POS) tags. POS tagging is a difficult task in Malayalam because it is a complex and compounding language. In this paper we describe the application of an artificial immunity system (AIS) to chunking; the implemented system achieved 96% precision and 93% recall. The system was tested on corpora collected from reputed newspapers and magazines. These corpora contained documents from five domains (sports, health, agriculture, science and politics), and each document contained simple, compound and complex sentences of various levels of complexity. A POS tag set with 52 tags was developed for preparing the tagged corpus for Malayalam. The phrase tag set contains 20 phrase tags.
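As an illustration of how chunk-level precision and recall figures like those above are commonly computed, the following is a minimal sketch. It assumes chunks are encoded with BIO labels and that a predicted chunk counts as correct only if its span and phrase tag both match the gold chunk exactly. The labels `NP` and `VP`, the BIO encoding, and the function names are illustrative assumptions; they are not taken from the paper's 20-tag phrase tag set or its evaluation procedure.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, phrase tag)
Chunk = Tuple[int, int, str]


def bio_to_chunks(tags: List[str]) -> Set[Chunk]:
    """Convert BIO labels such as B-NP, I-NP, O into labeled spans."""
    chunks: Set[Chunk] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                chunks.add((start, i, label))  # close the open chunk
            if tag == "O":
                start, label = None, None
            else:
                start, label = i, tag[2:]  # B-X (or a stray I-X) opens a chunk
    return chunks


def precision_recall(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Exact-match chunk precision and recall for one tagged sequence."""
    gold_chunks = bio_to_chunks(gold)
    pred_chunks = bio_to_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    return precision, recall


# Hypothetical five-token sentence: the prediction misses the second noun phrase.
gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
pred = ["B-NP", "I-NP", "B-VP", "O", "O"]
precision, recall = precision_recall(gold, pred)
```

In this hypothetical example both predicted chunks are correct, so precision is 1.0, while only two of the three gold chunks are recovered, so recall is 2/3.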
