Multialphabet coding with separate alphabet description

In lossless universal source coding of memoryless sequences whose alphabet size is not known a priori (multialphabet coding), the alphabet must be described in addition to the sequence itself. An efficient description of the alphabet can usually be made only by taking additional information into account. We show that the two descriptions can be separated so that the sequence itself is encoded independently of the alphabet description, and we present sequential coding methods for such sequences. These methods apply to coding schemes in which the alphabet description becomes available sequentially, such as prediction by partial matching (PPM).
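
As an illustration of the separation only, and not the method of this paper, the following minimal sketch splits a sequence into two parts: the order in which symbols first appear (passed to a separate alphabet description) and the sequence of first-appearance indices, which is charged an ideal code length sequentially. The function name `split_and_cost` and the escape estimator (probability k/(2t) for a new symbol and (n - 1/2)/t for a symbol seen n times, after t symbols with k distinct ones) are assumptions chosen for simplicity.

```python
import math


def split_and_cost(sequence):
    """Split a sequence into (pattern, new_symbols) and return the ideal
    code length in bits of the pattern under an illustrative escape estimator.

    The estimator is an assumption made for this sketch, not taken from the
    source: after t symbols with k distinct ones, a new symbol has probability
    k / (2t) and a symbol already seen n times has probability (n - 1/2) / t.
    """
    counts = {}       # symbol -> number of occurrences so far
    index_of = {}     # symbol -> index in order of first appearance
    pattern = []      # indices; depends only on the repetition structure
    new_symbols = []  # actual symbols, left to a separate alphabet description
    bits = 0.0

    for t, sym in enumerate(sequence):
        k = len(counts)
        if sym in counts:
            p = (counts[sym] - 0.5) / t
            counts[sym] += 1
        else:
            # The very first symbol is always new, so it costs nothing here.
            p = 1.0 if t == 0 else (0.5 * k) / t
            index_of[sym] = k
            counts[sym] = 1
            new_symbols.append(sym)
        pattern.append(index_of[sym])
        bits += -math.log2(p)

    return pattern, new_symbols, bits


if __name__ == "__main__":
    for text in ("abracadabra", "xyzxwxvxyzx"):
        pattern, new_symbols, bits = split_and_cost(text)
        print(text, pattern, new_symbols, round(bits, 3))
```

In this sketch the two example strings use different symbols but have the same repetition structure, so they receive the same code length; which symbols actually occurred is carried entirely by `new_symbols`, which would be encoded by whatever alphabet description is available.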