The Effects of XML Compression on SOAP Performance

XML is the foundation of the SOAP protocol and, in turn, of Web Service communication. This self-descriptive textual format for structured data is known to be verbose. The verbosity can cause problems in resource-constrained environments (e.g., small wireless devices) because of the communication and processing overhead it incurs. In this paper, we compare different binary representations of XML documents. To this end, we propose a multifaceted and reusable test suite based on real-world scenarios. Our main result is that only simple XML compression methods are suitable for a wide range of scenarios. While these simple methods do not match the compression ratios of more specialized ones, they remain competitive in most scenarios. We also show that there are scenarios that none of the evaluated methods can handle efficiently.
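To make the notion of a compression ratio concrete, the following minimal sketch applies two general-purpose compressors from the Python standard library to a SOAP-style message. It is an illustration only and not the test suite proposed in the paper: the message content, the helper names `compression_ratio` and `compare`, and the choice of zlib and gzip as the compressors are all assumptions made for this example.

```python
import gzip
import zlib

# Hypothetical SOAP message, made up for illustration only.
SOAP_MESSAGE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    + "".join(
        f"<reading><sensor>s{i}</sensor><value>{20 + i % 5}</value></reading>"
        for i in range(50)
    )
    + "</soap:Body></soap:Envelope>"
).encode("utf-8")


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(compressed) / len(original)


def compare(message: bytes) -> dict:
    """Return the ratio achieved by each general-purpose compressor."""
    return {
        "zlib": compression_ratio(message, zlib.compress(message, 9)),
        "gzip": compression_ratio(message, gzip.compress(message, compresslevel=9)),
    }


if __name__ == "__main__":
    for name, ratio in compare(SOAP_MESSAGE).items():
        print(f"{name}: {ratio:.2f}")
```

A ratio below 1 means the message shrank. For very short messages the fixed header and trailer of a container format such as gzip can outweigh the savings, so the ratio can exceed 1; a sketch like this one measures size only and says nothing about processing time.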
