A life cycle model of XML documents

Electronic documents produced in business processes are valuable information resources for organizations. In many cases they have to remain accessible long after the business processes or information systems in which they were created have ended. To improve the management and preservation of documents, organizations are deploying Extensible Markup Language (XML) as a standardized document format. The goal of this paper is to increase understanding of XML document management and to provide a framework for analyzing and describing the management of XML documents throughout their life. Following the design science approach, we introduce a document life cycle model consisting of five phases. For each phase we describe the typical activities related to the management of XML documents. We also identify the typical actors, systems, and types of content items associated with the activities of each phase. We demonstrate the use of the model in two case studies: one concerning the State Budget Proposal of the Finnish government and the other concerning a faculty council meeting agenda at a university.
