A novel way of lossless compression of digital mammograms using grammar codes

Breast cancer is the most common cancer among women in Canada. Despite slight declines in mortality rates over the past decade for women with breast cancer, one in nine Canadian women will develop breast cancer in her lifetime, and one in 25 will die from the disease. Digital mammograms (X-rays of the breast) may allow better cancer diagnosis and can be transmitted electronically around the world. The problem is that mammograms are large images whose details are weakly correlated. Achieving higher compression to save bandwidth without any data loss, so that a physician can diagnose disease correctly even over a communication network, is therefore a challenging problem. Among traditional lossless compression algorithms such as Huffman, Lempel-Ziv, and arithmetic coding, Lempel-Ziv and arithmetic coding perform better than Huffman coding on digital mammograms. To achieve better compression ratios, we investigate the newly developed grammar-based source code for the compression of medical images such as mammograms. In this grammar-based code, the original data (the image) is first transformed into a context-free grammar, from which the original data sequence can be fully reconstructed by performing parallel and recursive substitutions. An arithmetic coding algorithm is then used to compress the context-free grammar or the corresponding sequence of parsed phrases. We tested the grammar-based coding technique on digital mammograms obtained from the Mammographic Image Analysis Society (MIAS). The results show that the newly developed grammar code performs better than the traditional lossless coding schemes. Overall, the grammar-based lossless compression algorithm appears to be a promising technique for teleradiology applications.
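
The two-stage idea described above (transform the data into a context-free grammar, then recover it by substitution) can be illustrated with a minimal Python sketch. It is not the grammar transform evaluated in this work: it substitutes a simple Re-Pair-style rule that repeatedly replaces the most frequent adjacent pair of symbols with a new variable. The names `build_grammar` and `expand` and the sample pixel row are hypothetical, and the arithmetic coding of the resulting grammar is omitted.

```python
from collections import Counter


def build_grammar(seq):
    """Turn a symbol sequence into a start string plus production rules.

    Illustrative Re-Pair-style construction (an assumption, not the
    grammar transform used in the paper): the most frequent adjacent
    pair is replaced by a fresh variable until no pair occurs twice.
    """
    seq = list(seq)
    rules = {}  # variable name -> (left symbol, right symbol)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        var = f"V{len(rules)}"  # variables are strings, pixels are ints
        rules[var] = pair
        out, i = [], 0
        # Left-to-right scan replacing non-overlapping occurrences.
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(var)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(start, rules):
    """Recover the original sequence by substituting every variable."""
    out = []
    stack = list(reversed(start))
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            stack.append(right)  # pushed first so that left is expanded first
            stack.append(left)
        else:
            out.append(sym)
    return out


# Hypothetical row of pixel intensities with a repeating pattern.
row = [12, 12, 13, 12, 12, 13, 12, 12, 13, 40]
start, rules = build_grammar(row)
assert expand(start, rules) == row  # the transform is lossless
```

In a complete coder, the start string and the rules would then be passed to an entropy coder such as an arithmetic coder. The sketch shows only that the grammar is a compact, exactly invertible representation of the input.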