A pattern based tokenization model for XML parsing on mobile devices

This paper presents a theoretical tokenization model for XML parsing on resource-constrained mobile devices. The model is based on identifying sequentially repeating patterns in the structure of an XML document. Once the model identifies a repeating structure, it relieves the parser of the computationally intensive conventional tokenization process and focuses on extracting text-node values for further processing by the calling application. Our experiments demonstrate that the proposed tokenization model considerably relieves the processing bottlenecks encountered in conventional XML parsers.
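
The abstract does not specify the algorithm, so the following Python sketch only illustrates the general idea under stated assumptions: the repeating element name is known in advance, tags carry no attributes, and every record has the same tag sequence. All names (`conventional_tokens`, `extract_with_template`, `parse_records`) are hypothetical and do not come from the paper. The first record is tokenized conventionally to learn a tag template; later records are matched against that template by literal string comparison, and only the text between tags is copied out.

```python
import re

# Conventional path: classify every tag and every text run with a regex.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")

def conventional_tokens(xml):
    """Full tokenization; whitespace-only text runs are dropped."""
    return [t for t in TOKEN_RE.findall(xml) if t.strip()]

def extract_with_template(xml, pos, template):
    """Fast path (assumed design): match known tags literally, copy text slots.

    Returns (values, new_pos), or None as soon as the input deviates
    from the learned template.
    """
    values = []
    for slot in template:
        # Skip insignificant whitespace between tokens.
        while pos < len(xml) and xml[pos].isspace():
            pos += 1
        if slot is None:
            # Text slot: everything up to the next tag is the value.
            end = xml.find("<", pos)
            if end == -1:
                return None
            values.append(xml[pos:end].strip())
            pos = end
        elif xml.startswith(slot, pos):
            # Known tag: no re-tokenization, just advance past it.
            pos += len(slot)
        else:
            return None
    return values, pos

def parse_records(xml, record_tag):
    """Learn a template from the first record, then reuse it for the rest."""
    open_tag, close_tag = f"<{record_tag}>", f"</{record_tag}>"
    first = xml.find(open_tag)
    if first == -1:
        return []
    close = xml.find(close_tag, first)
    if close == -1:
        return []
    first_record = xml[first:close + len(close_tag)]
    # Tags are kept verbatim; text positions become None slots.
    template = [t if t.startswith("<") else None
                for t in conventional_tokens(first_record)]
    records, pos = [], first
    while True:
        result = extract_with_template(xml, pos, template)
        if result is None:
            # Pattern broken or input exhausted. A real parser would fall
            # back to conventional tokenization here; this sketch stops.
            break
        values, pos = result
        records.append(values)
    return records
```

For example, `parse_records("<items><item><id>1</id></item><item><id>2</id></item></items>", "item")` walks both `item` records with the template learned from the first one. Whether this kind of literal matching saves work on a given device depends on the parser it replaces, which this sketch does not measure.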
