An XML-based language for the Research & Development pipeline management problem

Abstract

Process management frameworks such as Sim-Opt [AIChE J. 10 (2001) 2226], which addresses the Research & Development (R&D) pipeline management problem with mathematical programming and discrete-event simulation, give rise to formulations that are extremely data-intensive and have complex hierarchical data requirements. This necessitates a data model that can represent any given problem instance in the form of a structured input language. The language in turn requires a parser that reads and interprets an input instance, captures the input data in memory, and allows the corresponding optimization and simulation models to be formulated and solved. In the past, customized structured documentation languages have been designed for this purpose. However, such languages often lead to a strong coupling between the language definition and the parser implementation: any redefinition or extension of the language to accommodate changes in the problem scope or in the optimization/simulation formulations requires a customized extension of the parser, which creates software engineering difficulties. One solution to these difficulties is provided by the Extensible Markup Language (XML), a recent advance in software technology that enables extensibility and data abstraction and provides efficient data structuring, parsers, and object orientation. XML requires data to be specified in an inherently hierarchical structure and provides generic parsers that need no redesign when the language is extended or redefined. This paper describes an XML-based language developed for the R&D pipeline management problem, with the keywords, structural syntax, and data content models for representing all aspects of the problem.
It also discusses the practical issue of effectively accessing the data stored in the document object model (DOM) upon parsing. This is addressed by designing a set of problem definition classes (PDC), which organize the data held in the generic DOM structure into data structures that facilitate formulation generation. Efforts to integrate the language, the DOM parser, and the PDC in a discrete-event simulation application for the R&D pipeline problem are also discussed.
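
As a rough illustration of the DOM-to-PDC idea, the following minimal Python sketch parses a small XML instance with a generic DOM parser and copies the result into typed objects. The tag and attribute names (`pipeline`, `project`, `task`, `duration`, `cost`, `successProbability`) and the classes `Task` and `Project` are hypothetical placeholders chosen for this example; they are not the keywords or classes defined by the language described in this paper.

```python
from dataclasses import dataclass, field
from xml.dom.minidom import parseString

# Hypothetical input instance: these tag and attribute names are illustrative
# assumptions, not the keywords of the language described in the paper.
SAMPLE = """<pipeline>
  <project name="P1">
    <task name="PhaseI" duration="12" cost="5.0" successProbability="0.8"/>
    <task name="PhaseII" duration="18" cost="9.5" successProbability="0.6"/>
  </project>
  <project name="P2">
    <task name="PhaseI" duration="10" cost="4.0" successProbability="0.9"/>
  </project>
</pipeline>"""


@dataclass
class Task:
    """Hypothetical problem definition class for one development task."""
    name: str
    duration: int
    cost: float
    success_probability: float


@dataclass
class Project:
    """Hypothetical problem definition class grouping the tasks of a project."""
    name: str
    tasks: list = field(default_factory=list)


def load_pipeline(xml_text):
    """Parse XML with a generic DOM parser, then copy it into typed objects."""
    dom = parseString(xml_text)
    projects = []
    for p in dom.getElementsByTagName("project"):
        project = Project(name=p.getAttribute("name"))
        for t in p.getElementsByTagName("task"):
            # DOM attributes are plain strings; convert them once here so
            # downstream formulation code works with typed values.
            project.tasks.append(Task(
                name=t.getAttribute("name"),
                duration=int(t.getAttribute("duration")),
                cost=float(t.getAttribute("cost")),
                success_probability=float(t.getAttribute("successProbability")),
            ))
        projects.append(project)
    return projects


projects = load_pipeline(SAMPLE)
for project in projects:
    total_cost = sum(t.cost for t in project.tasks)
    print(project.name, len(project.tasks), total_cost)
```

In this sketch, adding a new attribute to the input language changes only the mapping in `load_pipeline`; the generic parser itself is untouched, which is the decoupling that motivates the XML approach.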