Improving the performance of XML-based technologies by caching and reusing information

The growing synergy between Web services and grid-based technologies is enabling profound, dynamic interactions between applications dispersed in geographic, institutional, and conceptual space. Such deep interoperability requires the simplicity, robustness, and extensibility for which XML was conceived, making it a natural lingua franca for the network. These advantages come with a degree of inefficiency that may limit the applicability of XML. First, we investigate the limitations of XML for high-performance and highly interactive distributed computing. Our experimental results show that by focusing on the parsers routinely used to deserialize the XML messages exchanged in these systems, we can improve the performance of a generic end-to-end Web services-based solution. Second, we present a new parser, the cache parser, which uses a cache to reduce parsing time on both the sender and receiver sides by reusing information from previously parsed documents or messages similar to the one under examination. Finally, we show how our fast parser can improve the global throughput of any application based on Web or grid services, or on JAX-RPC. Experimental results demonstrate that our algorithm is 25 times faster than the fastest algorithm on the market and, when used in a Web services scenario, can dramatically increase the number of requests per second handled by a server (an improvement of up to 150%), bringing it close to a system that does not use XML at all.
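The caching idea can be illustrated with a minimal Python sketch. It is not the cache parser described above: it assumes, for simplicity, that a message is reused only when it is byte-for-byte identical to one seen before, whereas the abstract describes reuse across similar documents. The class name `CachingParser` and its `hits` and `misses` counters are hypothetical names introduced for this illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


class CachingParser:
    """Hypothetical sketch: reuse parse results for repeated messages.

    Simplifying assumption: only exact byte-level repeats hit the cache.
    """

    def __init__(self):
        self._cache = {}   # message digest -> parsed root element
        self.hits = 0      # messages served from the cache
        self.misses = 0    # messages that required a full parse

    def parse(self, message: bytes) -> ET.Element:
        # A digest of the raw bytes serves as the cache key.
        key = hashlib.sha256(message).digest()
        root = self._cache.get(key)
        if root is None:
            # First time this exact message is seen: parse and remember it.
            self.misses += 1
            root = ET.fromstring(message)
            self._cache[key] = root
        else:
            # Repeat message: skip parsing and return the stored tree.
            self.hits += 1
        return root


parser = CachingParser()
request = b"<getQuote><symbol>IBM</symbol></getQuote>"
first = parser.parse(request)    # full parse
second = parser.parse(request)   # served from the cache
```

Because the cached tree is shared between callers, this sketch is only safe if callers treat the returned element as read-only.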
