Morphology-based text compression

With the rapid growth of online information, the number of documents stored in electronic media has increased greatly. Easy and quick access to this information makes text compression increasingly important. In recent years, part of the work in the field of text compression has focused on the morphological structure of language. In this study, Turkish and English documents are compressed using different decomposition (parsing) methods, and the effect of these methods on compression efficiency is investigated. The documents are first parsed according to their morphological structure. In the next stage, the parsed documents are compressed with the Huffman coding method. In total, 10 different parsing techniques were created, and experiments were carried out with them on different corpora.
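
The two-stage pipeline described above (morphological parsing followed by Huffman coding) can be illustrated with a minimal sketch. It is not the implementation used in the study: the suffix list `SUFFIXES` and the helpers `split_word`, `parse`, `huffman_code`, `encode`, and `decode` are hypothetical names introduced here, and the naive suffix stripping only stands in for a real morphological analyzer. The sketch also ignores the cost of storing the code table.

```python
import heapq
from collections import Counter
from itertools import count

# Assumption: a toy suffix list standing in for a real morphological analyzer.
SUFFIXES = ("ler", "lar", "ing", "ed", "s")


def split_word(word):
    """Split a word into [stem, '+suffix'] using the toy suffix list."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return [word[:-len(suffix)], "+" + suffix]
    return [word]


def parse(text, morphological=True):
    """Turn text into tokens: whole words, or stems and suffixes."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(split_word(word) if morphological else [word])
    return tokens


def huffman_code(tokens):
    """Build a Huffman code (token -> bit string) from token frequencies."""
    freq = Counter(tokens)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Merge the two least frequent subtrees, prefixing one bit to each side.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(tokens, code):
    """Concatenate the code words of all tokens into one bit string."""
    return "".join(code[t] for t in tokens)


def decode(bits, code):
    """Recover the token sequence from a bit string (the code is prefix-free)."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out


if __name__ == "__main__":
    text = "kitaplar evler kitap ev walking walked walks"
    for mode in (False, True):
        tokens = parse(text, morphological=mode)
        code = huffman_code(tokens)
        print("morphological" if mode else "word-level", len(encode(tokens, code)), "bits")
```

Comparing the encoded length for `morphological=False` and `morphological=True` on the same text gives a rough idea of how a parsing method changes the size of the Huffman-coded output. A fair comparison would also have to account for the size of the code table and use realistic corpora.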