Morpheme Segmentation for Highly Agglutinative Tamil Language by Means of Unsupervised Learning

Understanding human language is one of the major challenges in the field of intelligent information systems. Morphological processing is the first step in many natural language processing applications, and it becomes crucial for morphologically rich languages. This paper illustrates the importance of unsupervised morphological segmentation algorithms for the problem of morpheme boundary detection in the Tamil language, which is highly inflectional and agglutinative in its morphology. The paper serves as groundwork: it presents the various methods and a comparative study of algorithms selected on the basis of their use for highly agglutinative languages such as Kannada, Finnish and Bengali. The principal advantages of these algorithms contribute to efficient morphological processing of the Tamil language.