apE—the original dataflow visualization environment

One of the first organized efforts to apply computer graphics technology to creating a scientific visualization toolkit was undertaken by the Ohio Supercomputer Graphics Project, resulting in the apE™ software. The dataflow model is the system's basis for both the high-level visual programming paradigm and the internal realization. It is designed for heterogeneous environments and is portable across a wide range of platforms. With its very advanced data management language, synchronization concepts, and many visualization modules, apE™ was ahead of its time in the late 1980s, and its influence is apparent in leading application builders today.

How it All Began

There was a strong demand for solutions in the field of scientific visualization in the early 1980s. Deskside graphics workstations became available, faster supercomputers allowed significantly more accurate simulations, and improved measuring devices provided more and increasingly precise data. Scientists recognized the value of visual tools to extract meaning from the now extremely complex datasets and searched for visualization software to support them [18].

In 1987, the Ohio Supercomputer Center was installed at the Ohio State University to serve academic and industrial users throughout the state of Ohio. It started with a Cray X-MP, which was followed by a Y-MP in 1989. In the year of the center's installation, the Ohio Supercomputer Graphics Project (OSGP) was formed as the graphics research component. It consisted exclusively of full-time professional graphics software experts. Already in 1987, OSGP began to design apE™ as a complete and effective scientific visualization system that would address the real needs of users and grow with its applications [5]. apE™ was originally an abbreviation for 'animation production Environment'; the word 'Environment' was capitalized because it was the main emphasis.
Later, the acronym was dropped as the developers' team and most users strongly personalized the project.

Concepts

While building apE™, all major design decisions were based upon a very thorough analysis of the targeted applications, environments, and platforms. State-of-the-art computer graphics algorithms were questioned in depth, and the entire system was implemented from the ground up with scientific visualization in mind. Arising problems and critical points, such as errors introduced by screen-space color interpolation, were tackled with much professional effort by OSGP. Finally, apE™ could bridge the gap between the well-understood computational and graphical domains with visualization (see also [6], [8]).

Flexibility

One of the major software engineering goals was flexibility: at the application builder level, at the module level, at the programming level, and at the portability level. apE™ is flexible to use. As an application builder, it allows users to easily assemble applications out of existing modules, which distinguishes this class from turnkey visualization systems that are built and optimized for one specific task. The integrated modules themselves are flexible. apE™ marked a great step towards interactive visualization for the exploration of data. It is flexible for programmers. Existing modules can be examined and even modified. New communication methods, modules, user interface elements, or data structures can be added by the user community. Portability and a wide range of platform support further added tremendous flexibility.

Dataflow Model

The visualization process can be modeled as a dataflow pipeline from the data source through steps of filtering, mapping, rendering, and finally displaying [12]. This paradigm has been adopted by most application builders, such as Khoros, AVS, IBM Data Explorer, IRIS Explorer, and apE™.
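The pipeline stages named above (source, filtering, mapping, rendering, displaying) can be illustrated with a minimal Python sketch. It is not apE™ code: the stage functions, their names, and the sample data are hypothetical and chosen only to show how data flows through a fixed chain of independent steps.

```python
from typing import Callable, List, Sequence, Tuple

# All names below are hypothetical; they only illustrate the pipeline idea.

def source() -> List[float]:
    """Data source: produce raw scalar samples."""
    return [0.0, 0.25, 0.5, 0.75, 1.0, 1.5]

def filter_stage(samples: Sequence[float]) -> List[float]:
    """Filtering: keep only samples inside the range of interest [0, 1]."""
    return [s for s in samples if 0.0 <= s <= 1.0]

def map_stage(samples: Sequence[float]) -> List[Tuple[int, int, int]]:
    """Mapping: turn each scalar into a grey-level RGB triple (0-255)."""
    return [(int(s * 255),) * 3 for s in samples]

def render_stage(colours: Sequence[Tuple[int, int, int]]) -> List[str]:
    """Rendering: produce a one-row 'image' as hex colour strings."""
    return ["#%02x%02x%02x" % c for c in colours]

def run_pipeline(src: Callable[[], list], stages: Sequence[Callable]) -> list:
    """Push the source data through each stage in order."""
    data = src()
    for stage in stages:
        data = stage(data)
    return data

image = run_pipeline(source, [filter_stage, map_stage, render_stage])
print(" ".join(image))  # displaying: here simply printed as text
```

Each stage only sees the output of the stage before it, so a stage can be swapped or inserted without touching the others.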
While some systems are implemented to be totally or partly monolithic, apE™ is realized as a dataflow system internally as well as externally. In apE™, the application building tool 'Wrench' allows the user to build pipelines in a drag-and-drop manner, selecting modules from the 'Palette' out of the following groups: data source, data creation, data manipulation, interface, rendering, and utility. The interactively chosen connections determine which module sends its output to which other module(s). Even very complex configurations are allowed, including multi-input and multi-output branches. Loops are controllable, which is important for simulation steering based upon visual feedback [1]. However, unlike other systems, Wrench does not check the syntactical correctness of the connections the user attempts. Wrong connections can only be detected at the time of pipeline invocation.

On the system level, every module is a separate UNIX™ process. The supervising Wrench executes the modules on the chosen host, informing each about its specific links within the pipeline. The main advantages of basing the internal system architecture on the dataflow model are concurrency, load balancing, improved reliability, and resource sharing. Furthermore, with separate modules and clearly specified communication interfaces, modules can be added or modified flexibly. Distributed execution of modules and coarse-grain parallelization for maximum resource utilization can be reliably realized. The data-driven (versus demand-driven) dataflow in apE™ supports this parallelization with time-dependent or multi-frame datasets.

The main disadvantages are the potentially inefficient data transfer between individual modules, the sometimes unavoidable data duplication, and the dominating one-way-ness of the dataflow. In addition, apE™ (like all true dataflow-oriented systems) does not achieve a very good overall computing performance.
For that reason OSGP considered developing some kind of 'compiler' to optimize constructed pipelines before invocation.

Inter-Module Communication

Due to the heterogeneous hardware environments for which apE™ was built, the data format for inter-module communication had to be well chosen. Incompatible binary formats between different machines prevented the use of a raw binary format, and ASCII data transfer is too slow for larger datasets. In order to solve this especially serious problem, the powerful dataflow language 'Flux' ('Flow' in Version 1.1) for the description of manifold scientific datasets and geometrical objects was developed [9]. Flux supports ASCII, IEEE, XDR (Sun), and native formats. Compression techniques (UNIX tools like zcat) were only integrated for reading data into or writing out of the pipeline, since compressing and uncompressing between modules proved inefficient.

Computer Graphics, May 1995

Every data element of apE™ can be represented in the Flux format, even type-in field values or the fully indexed online documentation. User-designed grouping is also supported. Indirect references to one Flux object from others are possible. With Flux, objects from the scientific and graphical domain can be described in the same language. The actual transmission of bytes between pairs of modules happens through UNIX pipes or sockets. Most modules can also be run connected in batch mode using sockets and shell scripts.

Hardware Independence and Portability

Close analysis of the targeted market (visualization laboratories, universities, and industry) revealed the importance of hardware independence, portability across many platforms, remote execution, and autonomy from window systems or graphic libraries. The operating system decision was also based on this analysis. With the rapidly growing number of workstations in the mid-1980s, UNIX became a de facto standard, although the vendor-specific versions varied much more than today.
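One concrete face of hardware independence is byte order: the same four bytes of an IEEE float mean different numbers on machines with different native layouts. The following Python sketch is not Flux and does not reproduce its format; it only uses the standard `struct` module to show why a raw native dump is not portable, how an agreed byte order (big-endian, as XDR uses) fixes that, and why a text encoding is portable but larger. The sample values are arbitrary.

```python
import struct

values = [1.0, -2.5, 3.14159]  # arbitrary sample data

# Explicit byte orders: "<" is little-endian, ">" is big-endian.
little = struct.pack("<3f", *values)
big = struct.pack(">3f", *values)  # big-endian IEEE 754 singles, the order XDR uses

# A reader that assumes the wrong byte order silently gets wrong numbers.
misread = struct.unpack("<3f", big)

# Portable round trip: writer and reader agree on big-endian.
decoded = struct.unpack(">3f", big)

# Text is portable as well, but takes more bytes and must be parsed.
ascii_form = " ".join(repr(v) for v in values).encode("ascii")

print("binary bytes:", len(big), "ascii bytes:", len(ascii_form))
print("decoded:", decoded)
print("misread:", misread)
```

The size gap between the two encodings grows with the number of digits per value, which is in line with the observation that ASCII transfer is too slow for larger datasets.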
The success of apE™ is also the result of tailoring the software to this specific type of market by supporting Convex, Cray, Silicon Graphics, Sun, Hewlett-Packard, NeXT, DEC, Stardent, and IBM platforms. Together with the possibility of remote execution, this allows apE™ to make use of distributed computing power from many local workstations and supercomputers [20].

The portability of apE™ was achieved by layering the machine-dependent and UNIX functions in the 'Platform' library. Features missing on some systems were also implemented here. A collection of shell scripts, the 'Forge' library, helps users turn the apE™ source (about 500,000 lines of code) into executables in a plug-and-play manner. These scripts need not be compiled and only assume the existence of '/usr/csh' and 'cc' on the target machine. Finally, the 'apE™' library, based on Flux, supports the development of modules by embedding them in a communication layer that reads, parses, and writes dataset groups in the pipeline.

User Interface Independence

When OSGP started to design apE™, the workstation market was even more volatile than today, and the X Window System could not yet be seen as the winner among several competing window systems. There were different X toolkits (Motif, OpenLook) that were not compatible or not available on all systems. Instead of building apE™ upon one of them and thus tying its fate to a single window system or toolkit, OSGP decided to implement a new user interface layer named 'Face' that could be successfully executed under SunView, X, GL, and NeXTStep Display PostScript. Face is a powerful generic user interface library that supports software development independent of the windowing system. Even the user input and event notification schemes of those window systems were abstracted in Face.
The programmer is offered all standard user interface items like buttons, menus, sliders, or type-in fields and higher-level elements such as browsers or alert popups. With Face, the user interface of apE™ has the same look and feel across a wide range of platforms. apE™ can also be ported easily to new window systems by just adapting Face. The proof for this was given, when within a few weeks porting