Efficient and Effective XML Encoding

Compared to other middleware approaches such as CORBA or Java RMI, the protocol overhead of SOAP is very high. This is a disadvantage for performance-critical applications, and especially so in environments with limited network bandwidth or resource-constrained computing devices. Although recent research has concentrated on more compact, binary representations of XML data, only very few approaches account for the special characteristics of SOAP communication. This chapter discusses the most relevant state-of-the-art technologies for compressing XML data and presents a novel solution for compacting SOAP messages. To achieve significantly better compression rates than current approaches, the compressor described in this chapter uses structure information from an XML Schema or WSDL document. With this additional knowledge of the “grammar” of the exchanged messages, the compressor generates a single custom pushdown automaton, which can be used both as a highly efficient validating parser and as a highly effective compressor. The main idea is to tag the transitions of the automaton with short binary identifiers, which are then used to encode the path through the automaton during parsing. The authors’ approach leads to extremely compact data representations and is also usable in environments with very limited CPU and memory resources.

DOI: 10.4018/978-1-61520-684-1.ch011
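
The idea of tagging automaton transitions with short binary identifiers can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical toy grammar (an `order` element containing one `id` followed by one or more `item` elements), uses a plain finite automaton without a stack instead of a pushdown automaton, and ignores element text content. All names (`TRANSITIONS`, `code_width`, `encode`, `decode`) are invented for this illustration. Each transition is identified by its index among the outgoing transitions of its state, so a state with only one outgoing transition contributes no bits at all.

```python
from math import ceil, log2

# Hypothetical toy grammar, standing in for what a schema might prescribe:
# <order> contains exactly one <id>, then one or more <item> elements.
# Each state maps to an ordered list of (event, next_state) transitions;
# the position in the list serves as the transition's binary identifier.
TRANSITIONS = {
    "start":    [("<order>", "in_order")],
    "in_order": [("<id>", "after_id")],
    "after_id": [("<item>", "items")],
    "items":    [("<item>", "items"), ("</order>", "end")],
}


def code_width(state):
    """Bits needed to pick one outgoing transition of `state` (0 if only one)."""
    n = len(TRANSITIONS[state])
    return ceil(log2(n)) if n > 1 else 0


def encode(events):
    """Validate `events` against the automaton and return the path as a bit string."""
    state, bits = "start", ""
    for event in events:
        labels = [label for label, _ in TRANSITIONS.get(state, [])]
        if event not in labels:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        index = labels.index(event)
        width = code_width(state)
        if width:
            bits += format(index, f"0{width}b")
        state = TRANSITIONS[state][index][1]
    if state != "end":
        raise ValueError("document ended before the automaton reached its end state")
    return bits


def decode(bits):
    """Replay the encoded path through the automaton to recover the events."""
    state, pos, events = "start", 0, []
    while state != "end":
        width = code_width(state)
        index = int(bits[pos:pos + width], 2) if width else 0
        pos += width
        if index >= len(TRANSITIONS[state]):
            raise ValueError("bit pattern does not name a transition")
        event, state = TRANSITIONS[state][index]
        events.append(event)
    if pos != len(bits):
        raise ValueError("unexpected trailing bits")
    return events


if __name__ == "__main__":
    doc = ["<order>", "<id>", "<item>", "<item>", "<item>", "</order>"]
    packed = encode(doc)
    print(packed, decode(packed) == doc)
```

In this toy example the structure of a six-event document is encoded in three bits, because only the state that offers a real choice (another `item` or the closing tag) emits anything. The same traversal rejects event sequences that the grammar does not allow, which mirrors the abstract's point that one automaton can act as both validating parser and compressor.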