ENHANCED CONFIX STRIPPING STEMMER AND ANTS ALGORITHM FOR CLASSIFYING NEWS DOCUMENT IN INDONESIAN LANGUAGE

The ants algorithm is a general and flexible method that was first designed for solving optimization problems such as the Traveling Salesman Problem. The analogy between ants finding the shortest path and finding the most similar documents motivated the ant-based text document clustering method. This method consists of two phases: finding the most similar documents (trial phase) and forming clusters (dividing phase). In this paper, we implemented the ant-based document clustering method on 253 news documents in the Indonesian language. In addition, we developed the enhanced confix stripping stemmer as an improvement on the confix stripping stemmer for stemming news documents in the Indonesian language. The experimental results showed that the ants algorithm can be applied to the classification of news documents in the Indonesian language, with a best F-measure of 0.86. The experiments also showed that the enhanced confix stripping stemmer successfully solved the confix stripping stemmer's problems and reduced the term size by up to 32.66%, whereas the confix stripping stemmer reduced it by only 30.95%.
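
The two evaluation figures reported above can be illustrated with a minimal sketch. It assumes the commonly used clustering F-measure (for each true class, the best harmonic mean of precision and recall over all clusters, weighted by class size) and a simple percentage reduction in the number of distinct terms; the abstract does not state which variants were actually used. The function names `clustering_f_measure` and `term_reduction_percent` and the toy inputs are hypothetical, chosen only for illustration.

```python
from collections import Counter


def clustering_f_measure(labels, clusters):
    """Class-size-weighted best F-measure over clusters (assumed variant)."""
    n = len(labels)
    class_sizes = Counter(labels)               # n_i: documents per true class
    cluster_sizes = Counter(clusters)           # n_j: documents per cluster
    pair_counts = Counter(zip(labels, clusters))  # n_ij: overlap of class i and cluster j

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = pair_counts.get((cls, clu), 0)
            if n_ij == 0:
                continue  # no overlap, so F is 0 for this pair
            precision = n_ij / n_j
            recall = n_ij / n_i
            f = 2 * precision * recall / (precision + recall)
            best = max(best, f)
        total += (n_i / n) * best
    return total


def term_reduction_percent(terms_before, terms_after):
    """Percentage by which stemming shrinks the number of distinct terms."""
    return 100.0 * (terms_before - terms_after) / terms_before


# Toy data (hypothetical, not taken from the experiments in this paper).
true_labels = ["sport", "sport", "politics", "politics"]
found_clusters = [0, 0, 0, 1]
score = clustering_f_measure(true_labels, found_clusters)
reduction = term_reduction_percent(200, 150)
```

Under this definition, a clustering that reproduces the true classes exactly scores 1.0, and any mixing of classes within a cluster lowers the score.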