The World Wide Web holds an enormous number of documents that vary in structure, content and representation. Under these conditions, maintaining existing documents becomes extremely difficult because of the cost and effort it requires. The overall objective of this thesis is to show the need for automating the evaluation and updating of Web resources, which considerably reduces the time such operations take and improves the quality of the code. To this end, the thesis describes the development of a tool that can analyse and transform Web documents. The proposed tool supports the maintenance of existing documents through two sets of services. The first set, the analysis services, allows the structure and content of documents to be investigated, detecting and extracting information that might be useful for later updates. The second set performs the actual modifications of the documents, offering a machine-conducted way of transforming them. This core functionality is supplemented with features for reading documents in various formats and writing them out in the desired form. The thesis presents the design and implementation of XML Recoder as a tool for the analysis and transformation of Web documents, together with an evaluation of the tool's functionality and the results of the chosen case studies.

Department of Computer and Information Science, Linköpings Universitet, SE-581 83 Linköping, Sweden